foundry package — Foundry_test 1.1 documentation - HTML AUTOGENERATION

foundry.foundry module

class foundry.foundry.Foundry(no_browser=False, no_local_server=False, search_index='mdf-test', *, dc: Dict = {}, mdf: Dict = {}, dataset: foundry.models.FoundryDataset = {}, config: foundry.models.FoundryConfig = FoundryConfig(dataframe_file='foundry_dataframe.json', data_file='foundry.hdf5', metadata_file='foundry_metadata.json', destination_endpoint=None, local=False, metadata_key='foundry', organization='foundry', local_cache_dir='./data'), dlhub_client: Any = None, forge_client: Any = None, connect_client: Any = None, xtract_tokens: Any = None)

Bases: foundry.models.FoundryMetadata

Foundry Client Base Class

TODO: Add Docstring

build(spec, globus=False, interval=3, file=False)

Build a Foundry Data Package

Parameters

  • spec (multiple) – dict or str (relative filename) of the data package specification

  • globus (bool) – If True, use Globus to fetch datasets

  • interval (int) – Polling interval, in seconds, for checking task status

  • type (str) – One of “file” or None

Returns

self, for chaining

Return type

Foundry

check_model_status(res)

Check status of model or function publication to DLHub

TODO: currently broken on DLHub side of things

check_status(source_id, short=False, raw=False)

Check the status of your submission.

Parameters

  • source_id (str) – The source_id (source_name + version information) of the submission to check. Returned in the res result from publish() via MDF Connect Client.

  • short (bool) – When False, will print a status summary containing all of the status steps for the dataset. When True, will print a short finished/processing message, useful for checking many datasets’ status at once. Default: False

  • raw (bool) – When False, will print a nicely-formatted status summary. When True, will return the full status result. For direct human consumption, False is recommended. Default: False

Returns

The full status result.

Return type

dict, if raw is True

collect_dataframes(packages=[])

Collect dataframes of local data packages

Parameters

packages (list) – List of packages to collect, defaults to all

Returns

Tuple of X (pandas.DataFrame), y (pandas.DataFrame)

Return type

tuple

configure(**kwargs)

Set Foundry config

Keyword Arguments

  • file (str) – Path to the file containing (default: self.config.metadata_file)

  • dataframe_file (str) – Filename for the dataframe file. Default: “foundry_dataframe.json”

  • data_file (str) – Filename for the data file. Default: “foundry.hdf5”

  • destination_endpoint (str) – Globus endpoint UUID where Foundry data should move

  • local_cache_dir (str) – Where to place collected data. Default: “./data”

Returns

self, for chaining

Return type

Foundry

connect_client: Any

describe_model()

dlhub_client: Any

download(globus=True, verbose=False, **kwargs)

Download a Foundry dataset

Parameters

  • globus (bool) – If True, use Globus to download the data; otherwise try HTTPS

  • verbose (bool) – If True, print debug information during the download

Returns

self, for chaining

Return type

Foundry

forge_client: Any

get_keys(type, as_object=False)

Get keys for a Foundry dataset

Parameters

  • type (str) – The type of key to be returned e.g., “input”, “target”

  • as_object (bool) – When False, will return a list of keys as strings. When True, will return the full key objects. Default: False

Returns

String representations of keys if as_object is False; otherwise the full key objects.

Return type

list

get_packages(paths=False)

Get available local data packages

Parameters

paths (bool) – If True, return paths in addition to package names; if False, return package names only

Returns

List describing local Foundry packages

Return type

list

list()

List available Foundry data packages

Returns

DataFrame with summary list of Foundry data packages including name, title, and publication year

Return type

pandas.DataFrame

load(name, download=True, globus=True, verbose=False, metadata=None, **kwargs)

Load the metadata for a Foundry dataset into the client

Parameters

  • name (str) – Name of the Foundry dataset

  • download (bool) – If True, download the data associated with the package (default is True)

  • globus (bool) – If True, download using Globus, otherwise HTTPS

  • verbose (bool) – If True, print additional debug information

  • metadata (dict) – For debug purposes. A search result analog to prepopulate metadata.

Keyword Arguments

interval (int) – How often to poll Globus to check if transfers are complete

Returns

self

load_data(source_id=None, globus=True)

Load in the data associated with the prescribed dataset

Tabular Data Type: Data are arranged in a standard data frame stored in self.dataframe_file. The contents are read, and

File Data Type: <<Add desc>>

For more complicated data structures, users should subclass Foundry and override the load_data function.

Parameters

  • inputs (list) – List of strings for input columns

  • targets (list) – List of strings for output columns

Returns

Tuple of X, y values

Return type

tuple
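The subclass-and-override pattern mentioned for load_data might look roughly like the sketch below. `BaseClient` is a hypothetical stand-in for Foundry (the real class needs network clients and credentials), and `PairedListClient`, its `records` attribute, and the record layout are assumptions made only for illustration. The one detail taken from this page is that load_data returns a tuple of X, y values.

```python
from typing import Any, Dict, List, Tuple


class BaseClient:
    """Hypothetical stand-in for foundry.foundry.Foundry (not the real class)."""

    def load_data(self, source_id=None, globus=True):
        raise NotImplementedError("default loaders are not modelled in this sketch")


class PairedListClient(BaseClient):
    """Illustrative subclass that overrides load_data for a custom structure."""

    def __init__(self, records: List[Dict[str, Any]], inputs: List[str], targets: List[str]):
        self.records = records
        self.inputs = inputs
        self.targets = targets

    def load_data(self, source_id=None, globus=True) -> Tuple[list, list]:
        # Split every record into its input columns (X) and target columns (y).
        X = [[rec[col] for col in self.inputs] for rec in self.records]
        y = [[rec[col] for col in self.targets] for rec in self.records]
        return X, y


client = PairedListClient(
    records=[{"a": 1, "b": 2, "gap": 0.5}, {"a": 3, "b": 4, "gap": 1.5}],
    inputs=["a", "b"],
    targets=["gap"],
)
X, y = client.load_data()
```

Keeping the overridden signature identical to the documented one means code that calls load_data on the base class should keep working with the subclass.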

publish(foundry_metadata, data_source, title, authors, update=False, publication_year=None, **kwargs)

Submit a dataset for publication

Parameters

  • foundry_metadata (dict) – Dict of metadata describing data package

  • data_source (string) – URL for Globus endpoint

  • title (string) – Title of data package

  • authors (list) – List of data package author names, e.g., Jack Black or Nunez, Victoria

  • update (bool) – True if this is an update to a prior data package (default: False)

  • publication_year (int) – Year of dataset publication. If None, will be set to the current calendar year by MDF Connect Client. (default: None)

Keyword Arguments

  • affiliations (list) – List of author affiliations

  • tags (list) – List of tags to apply to the data package

  • short_name (string) – Shortened/abbreviated name of the data package

  • publisher (string) – Data publishing entity (e.g. MDF, Zenodo, etc.)

Returns

Response from MDF Connect to allow tracking of the dataset. Contains source_id, which can be used to check the status of the submission.

Return type

dict

publish_model(options)

Submit a model or function for publication

Parameters

options – dict of all possible options

Options keys:

  • title (req)

  • authors (req)

  • short_name (req)

  • servable_type (req) (“static method”, “class method”, “keras”, “pytorch”, “tensorflow”, “sklearn”)

  • affiliations

  • domains

  • abstract

  • references

  • requirements (dict of library:version keypairs)

  • module (if Python method)

  • function (if Python method)

  • inputs (not needed for TF) (dict of options)

  • outputs (not needed for TF)

  • methods (e.g. research methods)

  • DOI

  • publication_year (advanced)

  • version (advanced)

  • visibility (dict of users and groups, each a list)

  • funding reference

  • rights
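As an illustration of the option keys listed for publish_model, the sketch below assembles an options dict and checks that the four keys marked (req) are present. The helper `missing_required_options` and all of the values are hypothetical; the check is not part of foundry or DLHub.

```python
# Keys marked "(req)" in the publish_model options list.
REQUIRED_KEYS = ("title", "authors", "short_name", "servable_type")


def missing_required_options(options: dict) -> list:
    """Return the required keys that are absent or empty (hypothetical helper)."""
    return [key for key in REQUIRED_KEYS if not options.get(key)]


# Placeholder values, for illustration only.
options = {
    "title": "Example band gap model",
    "authors": ["Nunez, Victoria"],
    "short_name": "example_bandgap",
    "servable_type": "sklearn",
    "requirements": {"scikit-learn": "0.24.2"},  # dict of library:version pairs
}

missing = missing_required_options(options)
incomplete = missing_required_options({"title": "Only a title"})
```

Running a check like this before calling publish_model is a design choice for catching omissions locally; whether the service validates the same keys is not stated on this page.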

TODO:

  • alternate identifier (to add an identifier of this artifact in another service)

  • add file

  • add directory

  • add files

run(name, inputs, **kwargs)

Run a model on data

Parameters

  • name (str) – DLHub model name

  • inputs – Data to send to DLHub as inputs (should be JSON serializable)

Returns

Results after invocation via the DLHub service

  • Pass **kwargs through to DLHub client and document kwargs

xtract_tokens: Any

foundry.models module

class foundry.models.FoundryConfig(*, dataframe_file: str = 'foundry_dataframe.json', data_file: str = 'foundry.hdf5', metadata_file: str = 'foundry_metadata.json', destination_endpoint: str = None, local: bool = False, metadata_key: str = 'foundry', organization: str = 'foundry', local_cache_dir: str = './data')

Bases: pydantic.main.BaseModel

Foundry Configuration

Configuration information for Foundry Dataset

Parameters

  • dataframe_file (str) – Filename to read dataframe contents from

  • metadata_file (str) – Filename to read metadata contents from; defaults to reading from MDF Discover

  • destination_endpoint (str) – Globus endpoint ID to transfer data to (defaults to local GCP installation)

  • local_cache_dir (str) – Path to local Foundry package cache
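The defaults in the FoundryConfig signature can be pictured with the stand-in below. `ConfigSketch` is a plain dataclass written for this example, not the pydantic model itself, and `/tmp/foundry_cache` is an arbitrary path. It only shows how overriding one field leaves the other documented defaults in place.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ConfigSketch:
    """Stand-in mirroring the documented FoundryConfig defaults (not the real class)."""

    dataframe_file: str = "foundry_dataframe.json"
    data_file: str = "foundry.hdf5"
    metadata_file: str = "foundry_metadata.json"
    destination_endpoint: Optional[str] = None
    local: bool = False
    metadata_key: str = "foundry"
    organization: str = "foundry"
    local_cache_dir: str = "./data"


default_config = ConfigSketch()
# Override a single field; everything else keeps its default.
custom_config = replace(default_config, local_cache_dir="/tmp/foundry_cache")
```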

data_file: Optional[str]

dataframe_file: Optional[str]

destination_endpoint: Optional[str]

local: Optional[bool]

metadata_file: Optional[str]

metadata_key: Optional[str]

organization: Optional[str]

class foundry.models.FoundryDataset(*, keys: List[foundry.models.FoundryKey] = None, splits: List[foundry.models.FoundrySplit] = None, type: foundry.models.FoundryDatasetType = None, short_name: str = '', dataframe: Any = None)

Bases: pydantic.main.BaseModel

Foundry Dataset

Schema for Foundry Datasets. This includes specifications of inputs, outputs, type, version, and more

class Config

Bases: object

arbitrary_types_allowed = True

dataframe: Optional[Any]

keys: List[foundry.models.FoundryKey]

short_name: Optional[str]

splits: Optional[List[foundry.models.FoundrySplit]]

type: foundry.models.FoundryDatasetType

class foundry.models.FoundryDatasetType(value)

Bases: enum.Enum

Foundry Dataset Types

Enumeration of the possible Foundry dataset types

files = 'files'

hdf5 = 'hdf5'

other = 'other'

tabular = 'tabular'

class foundry.models.FoundryKey(*, key: List[str] = [], type: str = '', filter: str = '', units: str = '', description: str = '', classes: List[foundry.models.FoundryKeyClass] = None)

Bases: pydantic.main.BaseModel

classes: Optional[List[foundry.models.FoundryKeyClass]]

description: Optional[str]

filter: Optional[str]

key: List[str]

type: str

units: Optional[str]

class foundry.models.FoundryKeyClass(*, label: str = '', name: str = '')

Bases: pydantic.main.BaseModel

label: str

name: str

class foundry.models.FoundryMetadata(*, dc: Dict = {}, mdf: Dict = {}, dataset: foundry.models.FoundryDataset = {}, config: foundry.models.FoundryConfig = FoundryConfig(dataframe_file='foundry_dataframe.json', data_file='foundry.hdf5', metadata_file='foundry_metadata.json', destination_endpoint=None, local=False, metadata_key='foundry', organization='foundry', local_cache_dir='./data'))

Bases: pydantic.main.BaseModel

class Config

Bases: object

arbitrary_types_allowed = True

config: foundry.models.FoundryConfig

dataset: foundry.models.FoundryDataset

dc: Optional[Dict]

mdf: Optional[Dict]

class foundry.models.FoundrySpecification(*, name: str = '', version: str = '', description: str = '', private: bool = False, dependencies: Any = None)

Bases: pydantic.main.BaseModel

Pydantic base class for interacting with the Foundry data package specification

The specification provides a way to group datasets and manage versions

add_dependency(name, version)

clear_dependencies()

dependencies: Any

description: str

name: str

private: bool

remove_duplicate_dependencies()

version: str

class foundry.models.FoundrySpecificationDataset(*, name: str = None, provider: str = 'MDF', version: str = None)

Bases: pydantic.main.BaseModel

Pydantic base class for datasets within the Foundry data package specification

name: Optional[str]

provider: Optional[str]

version: Optional[str]

class foundry.models.FoundrySplit(*, type: str = '', path: str = '', label: str = '')

Bases: pydantic.main.BaseModel

label: Optional[str]

path: Optional[str]

type: str
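To make the schema classes in this module more concrete, the sketch below writes a dataset description as plain dictionaries using the documented field names (key, type, units, description for keys; type, path, label for splits) and collects key names by type, in the spirit of get_keys. The column names, file paths, split types, and the `key_names` helper are hypothetical, and the real classes are pydantic models rather than dicts.

```python
from typing import Dict, List

# Field names follow FoundryDataset, FoundryKey, and FoundrySplit; values are made up.
dataset: Dict = {
    "short_name": "example_dataset",
    "type": "tabular",  # one of: files, hdf5, other, tabular
    "keys": [
        {"key": ["composition"], "type": "input", "units": "", "description": "Chemical formula"},
        {"key": ["temperature"], "type": "input", "units": "K", "description": "Measurement temperature"},
        {"key": ["band_gap"], "type": "target", "units": "eV", "description": "Measured band gap"},
    ],
    "splits": [
        {"type": "train", "path": "train.json", "label": "Training data"},
        {"type": "test", "path": "test.json", "label": "Test data"},
    ],
}


def key_names(dataset: Dict, key_type: str) -> List[str]:
    """Flatten the `key` lists of every key entry whose type matches (hypothetical helper)."""
    return [name for entry in dataset["keys"] if entry["type"] == key_type for name in entry["key"]]


input_names = key_names(dataset, "input")
target_names = key_names(dataset, "target")
```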

foundry.xtract_method module
