File

synapseclient.entity.File

Bases: Entity, Versionable

Represents a file in Synapse.

When a File object is stored, the associated local file or its URL will be stored in Synapse. A File must have a path (or URL) and a parent. By default, the name of the file in Synapse matches the filename; specifying the name attribute gives the File Entity a different name.

Changing File Names

A Synapse File Entity has a name separate from the name of the actual file it represents. When a file is uploaded to Synapse, its filename is fixed, even though the name of the entity can be changed at any time. Synapse provides a way to change this filename and the content-type of the file for future downloads by creating a new version of the file with a modified copy of itself. This can be done with the synapseutils.copy_functions.changeFileMetaData function.

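import os
import synapseutils

# `syn` is a logged-in synapseclient.Synapse instance; `synid` is the Synapse ID of the file
e = syn.get(synid)
print(os.path.basename(e.path))  # prints, e.g., "my_file.txt"
e = synapseutils.changeFileMetaData(syn, e, "my_newname_file.txt")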

Setting fileNameOverride will not change the name of a copy of the file that's already downloaded into your local cache. Either rename the local copy manually or remove it from the cache and re-download:

syn.cache.remove(e.dataFileHandleId)
e = syn.get(e)
print(os.path.basename(e.path))  ## prints "my_newname_file.txt"
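
The first option above, renaming the local copy by hand, can be sketched with the standard library alone. The helper name `rename_local_copy` is hypothetical and not part of synapseclient; it only renames a file on disk and does not touch Synapse or the client's own cache bookkeeping.

```python
import tempfile
from pathlib import Path


def rename_local_copy(path, new_name):
    """Rename the file at `path` to `new_name` within the same directory.

    Hypothetical helper for illustration; returns the new path as a string.
    """
    source = Path(path)
    target = source.with_name(new_name)
    if target.exists():
        # Refuse to overwrite an unrelated file that already has the new name.
        raise FileExistsError(f"{target} already exists")
    source.rename(target)
    return str(target)


# Demonstration against a throwaway directory rather than a real Synapse cache.
with tempfile.TemporaryDirectory() as tmp:
    old_path = Path(tmp) / "my_file.txt"
    old_path.write_text("example contents")
    new_path = rename_local_copy(old_path, "my_newname_file.txt")
    renamed_basename = Path(new_path).name
    old_still_exists = old_path.exists()
```

If the local copy should instead be fetched again under its new name, the cache-removal approach shown above is the one the documentation describes.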
PARAMETER DESCRIPTION
path

Location to be represented by this File

DEFAULT: None

name

Name of the file in Synapse, not to be confused with the name within the path

parent

Project or Folder where this File is stored

DEFAULT: None

synapseStore

Whether the File should be uploaded (True) or only its path or URL stored (False) when synapseclient.Synapse.store is called on the File object.

DEFAULT: True

contentType

Manually specify the Content-Type header, for example "image/png" or "application/json; charset=UTF-8"

dataFileHandleId

Specifying an existing dataFileHandleId reuses that file handle. The creator of the file must also be the owner of the dataFileHandleId to have permission to store the file.

properties

A map of Synapse properties

DEFAULT: None

annotations

A map of user-defined annotations

DEFAULT: None

local_state

Internal use only

DEFAULT: None
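
The source listing for `_update_file_handle` further down this page shows one case where `synapseStore` is switched off automatically: when the file handle is an `ExternalFileHandle` whose URL scheme is not `sftp`. Below is a standalone sketch of just that check using `urllib.parse`. The function name `implied_synapse_store` is hypothetical, the `current` argument stands in for the value already set on the entity, and the `or ""` guard for a missing URL is an addition made for this sketch.

```python
from urllib.parse import urlparse

EXTERNAL_FILE_HANDLE = "org.sagebionetworks.repo.model.file.ExternalFileHandle"


def implied_synapse_store(file_handle, current=True):
    """Return the synapseStore value implied by a file handle dictionary.

    Hypothetical helper: mirrors only the condition in _update_file_handle.
    """
    if (
        file_handle is not None
        and file_handle.get("concreteType") == EXTERNAL_FILE_HANDLE
        # A missing externalURL is treated as an empty string in this sketch.
        and urlparse(file_handle.get("externalURL") or "").scheme != "sftp"
    ):
        return False
    return current


http_handle = {
    "concreteType": EXTERNAL_FILE_HANDLE,
    "externalURL": "https://example.org/data.xyz",
}
sftp_handle = {
    "concreteType": EXTERNAL_FILE_HANDLE,
    "externalURL": "sftp://example.org/data.xyz",
}

http_result = implied_synapse_store(http_handle)
sftp_result = implied_synapse_store(sftp_handle)
no_handle_result = implied_synapse_store(None)
```

Under these assumptions an external HTTP(S) link forces `synapseStore` to `False`, while an SFTP link or a missing file handle leaves the existing value unchanged.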

Creating instances

Creating and storing a File

# The Entity name is derived from the path and is 'data.xyz'
data = File('/path/to/file/data.xyz', parent=folder)
data = syn.store(data)

Setting the name of the file in Synapse to 'my entity'

# The Entity name is specified as 'my entity'
data = File('/path/to/file/data.xyz', name="my entity", parent=folder)
data = syn.store(data)
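
The constructor in the source listing on this page splits `path` with `os.path.split` to fill the local `cacheDir` and `files` fields, and derives a default `name` only when none was passed. The following is a stdlib-only sketch of that bookkeeping. `local_fields_for` is a hypothetical helper, and `os.path.basename` is used as a stand-in for `utils.guess_file_name`, whose exact behavior (for example with URLs) is not shown on this page. The real constructor tests whether `name` is present in `kwargs`; this sketch simplifies that to a `None` check.

```python
import os


def local_fields_for(path=None, name=None):
    """Return the name and local fields a File would hold for `path`.

    Hypothetical helper for illustration; it does not contact Synapse.
    """
    if path and name is None:
        # Stand-in assumption for utils.guess_file_name(path).
        name = os.path.basename(path)
    if path:
        cache_dir, basename = os.path.split(path)
        files = [basename]
    else:
        cache_dir, files = None, []
    return {"name": name, "path": path, "cacheDir": cache_dir, "files": files}


derived = local_fields_for("/path/to/file/data.xyz")
named = local_fields_for("/path/to/file/data.xyz", name="my entity")
empty = local_fields_for()
```

Here `derived` takes its name from the path, `named` keeps the explicit name while still recording the on-disk basename in `files`, and `empty` has no local file information at all.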
Source code in synapseclient/entity.py
class File(Entity, Versionable):
    """
    Represents a file in Synapse.

    When a File object is stored, the associated local file or its URL will be stored in Synapse.
    A File must have a path (or URL) and a parent. By default, the name of the file in Synapse
    matches the filename, but by specifying the `name` attribute, the File Entity name can be different.

    ## Changing File Names

    A Synapse File Entity has a name separate from the name of the actual file it represents.
    When a file is uploaded to Synapse, its filename is fixed, even though the name of the entity
    can be changed at any time. Synapse provides a way to change this filename and the
    content-type of the file for future downloads by creating a new version of the file
    with a modified copy of itself. This can be done with the
    [synapseutils.copy_functions.changeFileMetaData][] function.

        import os
        import synapseutils
        e = syn.get(synid)
        print(os.path.basename(e.path))  ## prints, e.g., "my_file.txt"
        e = synapseutils.changeFileMetaData(syn, e, "my_newname_file.txt")

    Setting *fileNameOverride* will **not** change the name of a copy of the
    file that's already downloaded into your local cache. Either rename the
    local copy manually or remove it from the cache and re-download:

        syn.cache.remove(e.dataFileHandleId)
        e = syn.get(e)
        print(os.path.basename(e.path))  ## prints "my_newname_file.txt"

    Parameters:
        path: Location to be represented by this File
        name: Name of the file in Synapse, not to be confused with the name within the path
        parent: Project or Folder where this File is stored
        synapseStore: Whether the File should be uploaded or if only the path should
                        be stored when [synapseclient.Synapse.store][] is called on the File object.
        contentType: Manually specify the Content-Type header, for example "image/png" or
                        "application/json; charset=UTF-8"
        dataFileHandleId: Specifying an existing dataFileHandleId reuses that file handle.
                            The creator of the file must also be the owner of the dataFileHandleId
                            to have permission to store the file.
        properties: A map of Synapse properties
        annotations: A map of user defined annotations
        local_state: Internal use only

    Example: Creating instances
        Creating and storing a File

            # The Entity name is derived from the path and is 'data.xyz'
            data = File('/path/to/file/data.xyz', parent=folder)
            data = syn.store(data)

        Setting the name of the file in Synapse to 'my entity'

            # The Entity name is specified as 'my entity'
            data = File('/path/to/file/data.xyz', name="my entity", parent=folder)
            data = syn.store(data)
    """

    # Note: externalURL technically should not be in the keys since it's only a field/member variable of
    # ExternalFileHandle, but for backwards compatibility it's included
    _file_handle_keys = [
        "createdOn",
        "id",
        "concreteType",
        "contentSize",
        "createdBy",
        "etag",
        "fileName",
        "contentType",
        "contentMd5",
        "storageLocationId",
        "externalURL",
    ]
    # Used for backwards compatibility. The keys found below used to be located in the entity's local_state
    # (i.e. __dict__).
    _file_handle_aliases = {
        "md5": "contentMd5",
        "externalURL": "externalURL",
        "fileSize": "contentSize",
        "contentType": "contentType",
    }
    _file_handle_aliases_inverse = {v: k for k, v in _file_handle_aliases.items()}

    _property_keys = (
        Entity._property_keys + Versionable._property_keys + ["dataFileHandleId"]
    )
    _local_keys = Entity._local_keys + [
        "path",
        "cacheDir",
        "files",
        "synapseStore",
        "_file_handle",
    ]
    _synapse_entity_type = "org.sagebionetworks.repo.model.FileEntity"

    # TODO: File(path="/path/to/file", synapseStore=True, parentId="syn101")
    def __init__(
        self,
        path=None,
        parent=None,
        synapseStore=True,
        properties=None,
        annotations=None,
        local_state=None,
        **kwargs,
    ):
        if path and "name" not in kwargs:
            kwargs["name"] = utils.guess_file_name(path)
        self.__dict__["path"] = path
        if path:
            cacheDir, basename = os.path.split(path)
            self.__dict__["cacheDir"] = cacheDir
            self.__dict__["files"] = [basename]
        else:
            self.__dict__["cacheDir"] = None
            self.__dict__["files"] = []
        self.__dict__["synapseStore"] = synapseStore

        # pop the _file_handle from local properties because it is handled differently from other local_state
        self._update_file_handle(
            local_state.pop("_file_handle", None) if (local_state is not None) else None
        )

        super(File, self).__init__(
            concreteType=File._synapse_entity_type,
            properties=properties,
            annotations=annotations,
            local_state=local_state,
            parent=parent,
            **kwargs,
        )

    def _update_file_handle(self, file_handle_update_dict=None):
        """Sets the file handle. Should not need to be called by users.

        Args:
            file_handle_update_dict: A dictionary containing the file handle information.
        """

        # replace the file handle dict
        fh_dict = (
            DictObject(file_handle_update_dict)
            if file_handle_update_dict is not None
            else DictObject()
        )
        self.__dict__["_file_handle"] = fh_dict

        if (
            file_handle_update_dict is not None
            and file_handle_update_dict.get("concreteType")
            == "org.sagebionetworks.repo.model.file.ExternalFileHandle"
            and urllib_parse.urlparse(file_handle_update_dict.get("externalURL")).scheme
            != "sftp"
        ):
            self.__dict__["synapseStore"] = False

        # initialize all nonexistent keys to have value of None
        for key in self.__class__._file_handle_keys:
            if key not in fh_dict:
                fh_dict[key] = None

    def __setitem__(self, key, value):
        if key == "_file_handle":
            self._update_file_handle(value)
        elif key in self.__class__._file_handle_aliases:
            self._file_handle[self.__class__._file_handle_aliases[key]] = value
        else:

            def expand_and_convert_to_URL(path):
                return utils.as_url(os.path.expandvars(os.path.expanduser(path)))

            # hacky solution to allow immediate switching to an ExternalFileHandle pointing to the current path
            # yes, there is boolean zen but I feel like it is easier to read/understand this way
            if (
                key == "synapseStore"
                and value is False
                and self["synapseStore"] is True
                and utils.caller_module_name(inspect.currentframe()) != "client"
            ):
                self["externalURL"] = expand_and_convert_to_URL(self["path"])

            # hacky solution because we historically allowed modifying 'path' to indicate wanting to change to a new
            # ExternalFileHandle
            # don't change externalURL if it's just the synapseclient setting metadata after a function call such as
            # syn.get()
            if (
                key == "path"
                and not self["synapseStore"]
                and utils.caller_module_name(inspect.currentframe()) != "client"
                and utils.caller_module_name(inspect.currentframe())
                != "download_functions"
            ):
                self["externalURL"] = expand_and_convert_to_URL(value)
                self["contentMd5"] = None
                self["contentSize"] = None
            super(File, self).__setitem__(key, value)

    def __getitem__(self, item):
        if item in self.__class__._file_handle_aliases:
            return self._file_handle[self.__class__._file_handle_aliases[item]]
        else:
            return super(File, self).__getitem__(item)

    def _str_localstate(self, f):
        self._write_kvps(
            f,
            self._file_handle,
            lambda key: key
            in ["externalURL", "contentMd5", "contentSize", "contentType"],
            self._file_handle_aliases_inverse,
        )
        self._write_kvps(
            f,
            self.__dict__,
            lambda key: not (
                key in ["properties", "annotations", "_file_handle"]
                or key.startswith("__")
            ),
        )
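
The `__getitem__` and `__setitem__` overrides above redirect the legacy keys in `_file_handle_aliases` to the file handle dictionary. Below is a reduced sketch of that redirection, assuming a plain `dict` base in place of `Entity` and omitting the `synapseStore` and `path` special cases. `AliasedRecord` is a hypothetical class name, not part of synapseclient.

```python
class AliasedRecord(dict):
    """Dictionary that stores aliased keys in a separate file handle dict."""

    # Same alias table as File._file_handle_aliases in the listing above.
    _file_handle_aliases = {
        "md5": "contentMd5",
        "externalURL": "externalURL",
        "fileSize": "contentSize",
        "contentType": "contentType",
    }

    def __init__(self, **kwargs):
        super().__init__()
        self._file_handle = {}
        for key, value in kwargs.items():
            self[key] = value  # route every key through __setitem__

    def __setitem__(self, key, value):
        if key in self._file_handle_aliases:
            # Legacy key: write to the file handle under its canonical name.
            self._file_handle[self._file_handle_aliases[key]] = value
        else:
            super().__setitem__(key, value)

    def __getitem__(self, key):
        if key in self._file_handle_aliases:
            return self._file_handle[self._file_handle_aliases[key]]
        return super().__getitem__(key)


record = AliasedRecord(name="data.xyz", md5="abc123", fileSize=42)
md5_via_alias = record["md5"]
stored_handle = dict(record._file_handle)
```

Unlike `File`, this sketch does not pre-fill missing file handle keys with `None`, so reading an alias that was never set raises `KeyError`.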