opalalgorithms.utils.algorithmrunner
Algorithm runner class to run algorithms during computation.
Given an algorithm object, run the algorithm.
-
opalalgorithms.utils.algorithmrunner.mapper(writing_queue, params, file_queue, algorithm, dev_mode=False, sandboxing=True, python_version=2)
Call the map function and insert result into the queue if valid.
Parameters:
- writing_queue (mp.manager.Queue) – Queue for inserting results.
- params (dict) – Parameters to be used by each map of the algorithm.
- users_csv_files (list) – List of paths of csv files of users.
- algorithm (dict) – Dictionary with keys code and className specifying algorithm code and className.
- dev_mode (bool) – Should the algorithm run in development mode or production mode.
- sandboxing (bool) – Should sandboxing be used or not.
- python_version (int) – Python version being used for sandboxing.
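The "map, then enqueue if valid" flow described for mapper can be sketched roughly as follows. This is a minimal illustration and not the library's implementation: the function name drain_and_map and the map_fn and is_valid callables are hypothetical, and a plain queue.Queue stands in for the multiprocessing queues.

```python
import queue


def drain_and_map(file_queue, writing_queue, map_fn, is_valid):
    """Hypothetical worker loop: map each queued path, keep valid results.

    map_fn and is_valid are stand-ins supplied by the caller; they are
    assumptions of this sketch, not names from opalalgorithms.
    """
    while True:
        try:
            path = file_queue.get_nowait()
        except queue.Empty:
            break  # no more files to process
        result = map_fn(path)
        # Only results that pass validation reach the writing queue.
        if is_valid(result):
            writing_queue.put(result)
```

Passing the validation function in as an argument keeps the sketch independent of how validity is actually defined.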
-
opalalgorithms.utils.algorithmrunner.collector(writing_queue, params, dev_mode=False)
Collect the results in the writing queue and post them to the aggregator.
Returns: True on successful exit if dev_mode is set to False.
Note
If dev_mode is set to True, the collector instead returns all the results as a list.
-
opalalgorithms.utils.algorithmrunner.is_valid_result(result)
Check if result is valid.
Parameters: result – Output of the algorithm.
Note
Result is valid if it is a dict. All keys of the dict must be strings. All values must be numbers. These results are sent to the reducer, which applies sum, count, mean, median, or mode to the values belonging to the same key.
Example: {"alpha1": 1, "ant199": 1, ..}
Returns: Specifying if the result is valid or not.
Return type: bool
Todo
- Define what is valid with privacy and other concerns
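The validity rule stated in the note (a dict with string keys and numeric values) could be expressed along these lines. This is a hedged sketch written for Python 3, not the actual body of is_valid_result; the name looks_valid is hypothetical, and rejecting booleans is a choice made for this sketch only.

```python
from numbers import Number


def looks_valid(result):
    """Hypothetical restatement of the documented rule; not library code."""
    if not isinstance(result, dict):
        return False
    for key, value in result.items():
        if not isinstance(key, str):
            return False
        # bool is a subclass of int in Python; this sketch chooses to
        # reject it so that only genuine numbers pass.
        if isinstance(value, bool) or not isinstance(value, Number):
            return False
    return True
```

For instance, looks_valid({"alpha1": 1, "ant199": 1}) passes, while a list or a dict with a non-string key does not.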
-
opalalgorithms.utils.algorithmrunner.process_user_csv(params, user_csv_file, algorithm, dev_mode, sandboxing, jail)
Process a single user csv file.
Parameters: - params (dict) – Parameters for the request.
- user_csv_file (string) – Path to user csv file.
- algorithm (dict) – Dictionary with keys code and className specifying algorithm code and className.
- dev_mode (bool) – Should the algorithm run in development mode or production mode.
- sandboxing (bool) – Should sandboxing be used or not.
- jail (codejail.Jail) – Jail object.
Returns: Result of the execution.
Raises: SafeExecException – If the execution wasn’t successful.
-
opalalgorithms.utils.algorithmrunner.get_jail(python_version=2)
Return codejail object.
Note
Please set the environment variables OPALALGO_SANDBOX_VENV and OPALALGO_SANDBOX_USER before calling this function. OPALALGO_SANDBOX_VENV must be set to the path of the sandbox virtual environment. OPALALGO_SANDBOX_USER must be set to the user running the sandboxed algorithms.
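One way to guard against calling get_jail with an incomplete environment is a small pre-flight check like the one below. The helper missing_sandbox_vars is hypothetical and not part of opalalgorithms; only the two variable names come from the note above, and the example values in the comments are placeholders.

```python
import os

REQUIRED_SANDBOX_VARS = ("OPALALGO_SANDBOX_VENV", "OPALALGO_SANDBOX_USER")


def missing_sandbox_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the current process environment.
    """
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_SANDBOX_VARS if not environ.get(name)]


# Example setup with placeholder values (assumptions, adjust to your system):
# os.environ["OPALALGO_SANDBOX_VENV"] = "/path/to/sandbox/venv"
# os.environ["OPALALGO_SANDBOX_USER"] = "sandbox"
```

An empty list from missing_sandbox_vars() means both variables are present.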
-
class opalalgorithms.utils.algorithmrunner.AlgorithmRunner(algorithm, dev_mode=False, multiprocess=True, sandboxing=True)
Algorithm runner.
Parameters: -
__call__(params, data_dir, num_threads, weights_file=None)
Run algorithm.
Selects the csv files from the data directory and divides them into chunks of equal size across the num_threads threads. Each thread calls the map function on its csv files and processes the results. The collector thread waits for results before posting them to the aggregator service.
Returns: Amount of time required for computation in microseconds.
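The chunking step described for __call__ can be illustrated with a short sketch. How AlgorithmRunner actually splits the files is not shown on this page, so the function below is an assumption: split_into_chunks is a hypothetical name, and it hands any leftover files to the first chunks so that sizes differ by at most one.

```python
def split_into_chunks(paths, num_threads):
    """Split paths into num_threads contiguous, near-equal chunks.

    Hypothetical helper for illustration; not the library's code.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    size, extra = divmod(len(paths), num_threads)
    chunks = []
    start = 0
    for i in range(num_threads):
        # The first `extra` chunks take one additional file each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(paths[start:end])
        start = end
    return chunks
```

With five files and two threads this yields one chunk of three files and one of two; with more threads than files, the trailing chunks are empty.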
-