Overview

The synapseutils package provides both higher-level functions and utilities for interacting with Synapse. These include:

copy

  • This function helps users copy entities (Tables, Links, Files, Folders, Projects) and recursively copies everything in directories.

  • A mapping of the old entities to the new entities is created. All wikis of each entity are also copied over, and links to Synapse IDs are updated.

param syn

A Synapse object: syn = synapseclient.login(). You must be logged in to Synapse.

param entity

A synapse entity ID

param destinationId

Synapse ID of the folder or project that the entity is copied to

param skipCopyWikiPage

Skips copying the wiki pages. Default is False.

param skipCopyAnnotations

Skips copying the annotations. Default is False.

Examples:

import synapseutils
import synapseclient
syn = synapseclient.login()
synapseutils.copy(syn, …)

Examples and extra parameters unique to each copy function

– COPYING FILES

param version

The version of the file to copy. Defaults to None.

param updateExisting

When the destination contains an entity with the same name, users can choose to update that entity. It must be the same entity type. Defaults to False.

param setProvenance

Has three values to set the provenance of the copied entity:

  • traceback: Sets to the source entity

  • existing: Sets to the source entity's original provenance (if it exists)

  • None: No provenance is set

Examples:

synapseutils.copy(syn, "syn12345", "syn45678", updateExisting=False, setProvenance="traceback", version=None)

– COPYING FOLDERS/PROJECTS

param excludeTypes

Accepts a list of entity types (file, table, link) that should not be copied. Defaults to an empty list.

Examples:

# This will copy everything in the project into the destinationId except files and tables.
synapseutils.copy(syn, "syn123450", "syn345678", excludeTypes=["file", "table"])

returns

A mapping between the original and copied entities: {'syn1234': 'syn33455'}
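
As a minimal sketch of how the returned mapping might be used, the helper below translates a list of source IDs into their copied counterparts. The name remap_ids is hypothetical and not part of synapseutils, and the mapping literal only mirrors the return shape shown above rather than a real copy result.

```python
def remap_ids(old_ids, mapping):
    """Translate source IDs to copied IDs, leaving unknown IDs unchanged."""
    return [mapping.get(old_id, old_id) for old_id in old_ids]


# Assumed shape of the value returned by copy, taken from the "returns" note above.
mapping = {"syn1234": "syn33455"}

# "syn9999" was not part of the copy, so it passes through untouched.
remapped = remap_ids(["syn1234", "syn9999"], mapping)
print(remapped)
```

Passing unknown IDs through unchanged is a design choice for this sketch; raising an error instead may be preferable when every ID is expected to have been copied.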

walk

synapseutils.walk.walk(syn, synId)

Traverses the hierarchy of files and folders stored under the synId. Behaves like os.walk().

Parameters
  • syn – A Synapse object: syn = synapseclient.login(). You must be logged in to Synapse.

  • synId – A synapse ID of a folder or project

Example:

import synapseclient
import synapseutils

syn = synapseclient.login()
walkedPath = synapseutils.walk(syn, "syn1234")
for dirpath, dirname, filename in walkedPath:
    print(dirpath)
    print(dirname)  # All the folders in the directory path
    print(filename)  # All the files in the directory path
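
Because walk is described as behaving like os.walk(), the traversal pattern can be tried locally without a Synapse login. The sketch below uses only the standard library on a temporary directory; list_tree and the file names are made up for illustration, and the values Synapse yields for each entry may carry more information than plain names.

```python
import os
import tempfile


def list_tree(root):
    """Collect (relative dirpath, dirnames, filenames) for each directory, top-down."""
    rows = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place also fixes the order in which os.walk descends
        rows.append((os.path.relpath(dirpath, root), list(dirnames), sorted(filenames)))
    return rows


with tempfile.TemporaryDirectory() as root:
    # Build a tiny hierarchy: one file at the top level and one inside a folder.
    os.makedirs(os.path.join(root, "folderA"))
    for rel in ("top.txt", os.path.join("folderA", "nested.txt")):
        with open(os.path.join(root, rel), "w") as handle:
            handle.write("x")
    tree = list_tree(root)

for row in tree:
    print(row)
```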

sync

synapseutils.sync.generateManifest(syn, allFiles, filename, provenance_cache=None)

Generates a manifest file based on a list of entity objects.

Parameters
  • syn – A synapse object as obtained with syn = synapseclient.login()

  • allFiles – A list of File Entities

  • filename – file where manifest will be written

  • provenance_cache – an optional dict of known provenance dicts keyed by entity ids

synapseutils.sync.readManifestFile(syn, manifestFile)

Verifies a file manifest and returns a reordered dataframe ready for upload.

Parameters
  • syn – A synapse object as obtained with syn = synapseclient.login()

  • manifestFile – A tsv file with file locations and metadata to be pushed to Synapse. See below for details

Returns

A pandas dataframe if the manifest is validated.

For a description of the file format, see "Manifest file format" under synapseutils.sync.syncToSynapse().

synapseutils.sync.syncFromSynapse(syn, entity, path=None, ifcollision='overwrite.local', allFiles=None, followLink=False)

Synchronizes all the files in a folder (including subfolders) from Synapse and adds a readme manifest with file metadata.

Parameters
  • syn – A synapse object as obtained with syn = synapseclient.login()

  • entity – A Synapse ID or a Synapse Entity object of type file, folder or project.

  • path – An optional path where the file hierarchy will be reproduced. If not specified, the files will be placed in the synapseCache by default.

  • ifcollision – Determines how to handle file collisions. May be “overwrite.local”, “keep.local”, or “keep.both”. Defaults to “overwrite.local”.

  • followLink – Determines whether the link returns the target Entity. Defaults to False.

Returns

list of entities (files, tables, links)

This function will crawl all subfolders of the project/folder specified by entity and download all files that have not already been downloaded. If there are newer files in Synapse (or a local file has been edited outside of the cache) since the last download, then the local file will be replaced by the new file unless “ifcollision” is changed.

If the files are being downloaded to a specific location outside of the Synapse cache, a file (SYNAPSE_METADATA_MANIFEST.tsv) containing the metadata (annotations, storage location and provenance) of all downloaded files will also be added in the path.

See also: synapseutils.sync.syncToSynapse()

Example: Download and print the paths of all downloaded files:

import synapseclient
import synapseutils
syn = synapseclient.login()
entities = synapseutils.syncFromSynapse(syn, "syn1234")
for f in entities:
    print(f.path)

synapseutils.sync.syncToSynapse(syn, manifestFile, dryRun=False, sendMessages=True, retries=4)

Synchronizes the files specified in the manifest file to Synapse.

Parameters
  • syn – A synapse object as obtained with syn = synapseclient.login()

  • manifestFile – A tsv file with file locations and metadata to be pushed to Synapse. See below for details

  • dryRun – Performs validation without uploading if set to True (default is False)

Given a file describing all of the uploads, this function uploads the content to Synapse and optionally notifies you via Synapse messaging (email) at specific intervals, on errors and on completion.

Manifest file format

The manifest is a tab-delimited file with one row per file to upload and columns describing the file. The minimum required columns are path and parent, where path is the local file path and parent is the Synapse ID of the project or folder that the file is uploaded to. In addition to these columns, you can specify any of the parameters to the File constructor (name, synapseStore, contentType) as well as parameters to the syn.store command (used, executed, activityName, activityDescription, forceVersion). Used and executed can be semicolon (";") separated lists of Synapse IDs, URLs and/or local file paths of files already stored in Synapse (or being stored in Synapse by the manifest). Any additional columns will be added as annotations.
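
To make the semicolon-separated format of the used and executed columns concrete, here is a small sketch that splits one cell into its items. The function name split_provenance is hypothetical, and how the library itself parses these cells is not shown in this document, so treat this as an illustration of the format only.

```python
def split_provenance(cell):
    """Split a 'used' or 'executed' cell on ';' and drop surrounding whitespace."""
    return [item.strip() for item in cell.split(";") if item.strip()]


# Sample cell taken from the provenance examples in this section.
used_items = split_provenance("syn1235; /path/to_local/file.txt")
print(used_items)
```

An empty cell yields an empty list, which matches a row that records no provenance.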

Required fields:

Field     Meaning                   Example
path      local file path or URL    /path/to/local/file.txt
parent    Synapse ID                syn1235

Common fields:

Field           Meaning                      Example
name            name of file in Synapse      Example_file
forceVersion    whether to update version    False

Provenance fields:

Field                  Meaning                                 Example
used                   List of items used to generate file     syn1235; /path/to_local/file.txt
executed               List of items executed                  https://github.org/; /path/to_local/code.py
activityName           Name of activity in provenance          "Ran normalization"
activityDescription    Text description of what was done       "Ran algorithm xyx with parameters…"

Annotations:

Any columns that are not among the reserved names described above will be interpreted as annotations of the file.
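
The rule above can be sketched as a simple partition of one manifest row. The reserved set below is assembled only from the field names listed in this section, so it is an assumption that the library reserves exactly these names; split_row is a hypothetical helper, not a synapseutils function.

```python
# Field names listed in this section; the library may reserve others.
RESERVED_COLUMNS = frozenset({
    "path", "parent", "name", "synapseStore", "contentType",
    "used", "executed", "activityName", "activityDescription", "forceVersion",
})


def split_row(row):
    """Separate one manifest row into reserved fields and annotation columns."""
    reserved = {key: value for key, value in row.items() if key in RESERVED_COLUMNS}
    annotations = {key: value for key, value in row.items() if key not in RESERVED_COLUMNS}
    return reserved, annotations


row = {"path": "/path/file1.txt", "parent": "syn1243", "annot1": "bar", "annot2": "3.1415"}
reserved, annotations = split_row(row)
print(annotations)
```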

Other optional fields:

Field           Meaning                                             Example
synapseStore    Boolean describing whether to upload files          True
contentType     content type of the file, overriding the default    text/html

Example manifest file

path               parent      annot1    annot2    used                         executed
/path/file1.txt    syn1243     "bar"     3.1415    "syn124; /path/file2.txt"    https://github.org/foo/bar
/path/file2.txt    syn12433    "baz"     2.71      ""                           https://github.org/foo/baz
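
A manifest like the one above could be produced with Python's standard csv module. This is a sketch under stated assumptions: build_manifest is a hypothetical helper, it checks only the two required columns named in this section, and it orders extra columns alphabetically for predictability rather than because the format requires it.

```python
import csv
import io

REQUIRED = ("path", "parent")


def build_manifest(rows):
    """Serialize row dicts to tab-delimited text with the required columns first."""
    for row in rows:
        missing = [column for column in REQUIRED if not row.get(column)]
        if missing:
            raise ValueError(f"row is missing required column(s): {missing}")
    # Every column that is not required is written after path and parent.
    extra = sorted({key for row in rows for key in row} - set(REQUIRED))
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[*REQUIRED, *extra], delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)  # columns absent from a row are written as empty cells
    return buffer.getvalue()


manifest_text = build_manifest([
    {"path": "/path/file1.txt", "parent": "syn1243", "annot1": "bar",
     "used": "syn124; /path/file2.txt"},
    {"path": "/path/file2.txt", "parent": "syn12433", "annot1": "baz"},
])
print(manifest_text)
```

Writing the returned text to a .tsv file gives something that could be passed as manifestFile; running syncToSynapse with dryRun=True first is a reasonable way to validate it before uploading.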

monitor

synapseutils.monitor.notifyMe(syn, messageSubject='', retries=0)

Function decorator that notifies you via email whenever a function completes running or there is a failure.

Parameters
  • syn – A synapse object as obtained with syn = synapseclient.login()

  • messageSubject – A string with subject line for sent out messages.

  • retries – Number of retries to attempt on failure (default=0)

Example:

# to decorate a function that you define
from synapseutils import notifyMe
import synapseclient
syn = synapseclient.login()

@notifyMe(syn, 'Long running function', retries=2)
def my_function(x):
    doing_something()
    return long_runtime_func(x)

my_function(123)

#############################
# to wrap a function that already exists
from synapseutils import notifyMe
import synapseclient
syn = synapseclient.login()

notify_decorator = notifyMe(syn, 'Long running query', retries=2)
my_query = notify_decorator(syn.tableQuery)
results = my_query("select id from syn1223")
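
For readers curious about the pattern behind such a decorator, the following is a minimal sketch of a retry-and-notify decorator factory. It is not the implementation of notifyMe: notify_on_finish, its send callback, and the message wording are all hypothetical, and a plain list stands in for email delivery so the example runs offline.

```python
import functools


def notify_on_finish(send, subject="", retries=0):
    """Decorator factory: retry on failure and report each outcome through send()."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # One initial attempt plus up to `retries` further attempts.
            for attempt in range(retries + 1):
                try:
                    result = func(*args, **kwargs)
                except Exception as error:
                    send(f"{subject}: attempt {attempt + 1} failed: {error}")
                    if attempt == retries:
                        raise  # out of retries, so surface the last error
                else:
                    send(f"{subject}: {func.__name__} completed")
                    return result
        return wrapper
    return decorator


messages = []          # stands in for an outbox
calls = {"count": 0}   # lets the demo function fail exactly once


@notify_on_finish(messages.append, "Long running function", retries=2)
def flaky(x):
    calls["count"] += 1
    if calls["count"] < 2:
        raise RuntimeError("transient error")
    return x * 2


result = flaky(21)
print(result)
print(messages)
```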

synapseutils.monitor.with_progress_bar(func, totalCalls, prefix='', postfix='', isBytes=False)

Wraps a function to add a progress bar based on the number of calls to that function.

Parameters
  • func – Function being wrapped with a progress bar

  • totalCalls – total number of items/bytes when completed

  • prefix – String printed before progress bar

  • postfix – String printed after progress bar

  • isBytes – A boolean indicating whether to convert bytes to kB, MB, GB etc.

Returns

a wrapped function that contains a progress bar
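
The idea of counting calls to drive a progress display can be sketched as follows. This is an illustration only, not the library's with_progress_bar: with_call_progress, its out parameter, and the bar layout are hypothetical, and byte-size conversion is left out.

```python
import functools
import io
import sys


def with_call_progress(func, total_calls, prefix="", postfix="", out=None):
    """Wrap func so each call reports how many of total_calls have completed."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    stream = out if out is not None else sys.stdout
    completed = 0

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal completed
        result = func(*args, **kwargs)
        completed += 1
        # Draw a 20-character bar, capped at full once total_calls is reached.
        filled = int(20 * min(completed / total_calls, 1.0))
        bar = "#" * filled + "-" * (20 - filled)
        stream.write(f"{prefix} [{bar}] {completed}/{total_calls} {postfix}\n")
        return result

    return wrapper


buffer = io.StringIO()  # capture output here; omit `out` to print to stdout
square = with_call_progress(lambda x: x * x, 4, prefix="Squaring", postfix="done", out=buffer)
results = [square(n) for n in range(4)]
print(buffer.getvalue())
```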