start_run
- tayph.start_run(configfile, parallel=True, xcor_parallel=False, dp='', debug=False)
This is the main command-line initializer of the cross-correlation routine provided by Tayph. It parses a configuration file located at configfile, which should contain predefined keywords. These keywords are passed on to the run_instance routine, which executes the analysis cascade. A call to this function is called a “run” of Tayph. A run takes this configuration file as input, as well as a dataset, a (list of) cross-correlation template(s) and a (list of) model(s) for injection purposes.
Various routines in the main analysis cascade have been parallelised as of May 2021, allowing for a great speed-up on systems that support multi-threading. In case simple parallelisation via the joblib package is not possible on your system, parallel computation can be switched off by setting parallel=False.
Parallelisation of the cross-correlation function is handled separately, because it is highly demanding on the available memory: each template, Doppler-shifted to all radial velocity samples, has to be loaded into memory at once. Memory usage scales with the number of templates, the number of spectral pixels per template, and the number of cross-correlation steps (= RVrange / dv).
As an example, the templates provided along with the demo data weigh 2.8 MB each when interpolated onto the wavelength grid of the data. That’s 1.4 GB per template for 500 RV points (e.g. 250 km/s on either side for 1 km/s velocity steps, close to the bare minimum you would need). Doing more than 5 such templates in parallel will overflow the memory of a standard laptop (8 GB RAM). Realistically, you might be computing 8 or 16 templates in parallel (depending on the number of threads you have), with 1000 steps in velocity, meaning that you’d need roughly 20 to 40 GB of RAM. That’s in the realm of servers.
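The arithmetic in this example can be reproduced with a short back-of-the-envelope sketch. The helper names below (xcor_memory_gb, max_parallel_templates) are hypothetical and not part of Tayph. The sketch assumes that memory is simply the product of the number of templates, the size of one interpolated template, and the number of velocity steps, with 1 GB taken as 1000 MB.

```python
import math


def xcor_memory_gb(n_templates, template_mb, rv_range_kms, dv_kms):
    """Rough memory (GB, with 1 GB = 1000 MB) to hold all shifted templates.

    Hypothetical helper, not part of Tayph. rv_range_kms is the total
    velocity span (e.g. 500 for 250 km/s on either side).
    """
    n_steps = round(rv_range_kms / dv_kms)  # number of cross-correlation steps
    return n_templates * template_mb * n_steps / 1000.0


def max_parallel_templates(ram_gb, template_mb, rv_range_kms, dv_kms):
    """Largest number of templates whose shifted copies fit in ram_gb."""
    per_template_gb = xcor_memory_gb(1, template_mb, rv_range_kms, dv_kms)
    return math.floor(ram_gb / per_template_gb)


# Demo-data numbers from the text: 2.8 MB per template, 500 km/s span, 1 km/s steps.
print(f"{xcor_memory_gb(1, 2.8, 500, 1.0):.1f} GB per template")
print(max_parallel_templates(8, 2.8, 500, 1.0), "templates fit in 8 GB")
```

With the demo-data numbers this gives 1.4 GB per template and 5 templates within 8 GB, matching the figures quoted above. It ignores any other memory the analysis needs, so treat the result as a lower bound.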
Memory overflow won’t necessarily make your system crash, as long as it has allocated sufficient swap memory. However, swapping makes your calculation very slow, potentially slower than doing the computation in sequence. Therefore, parallelisation of the cross-correlation operation is provided via the xcor_parallel keyword, and is switched off by default. In case you are running Tayph on a server with many cores and plenty of RAM, switching this on may yield speed gains of a factor of 5 to 10 in cross-correlation.
Set the dp keyword to an alternative datapath to override the datapath in the runfile. This makes it straightforward to execute the same runfile on multiple datasets (e.g. nights), for instance in a loop.
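A minimal sketch of such a loop is given below. The wrapper run_on_datasets is a hypothetical helper, not part of Tayph. It takes start_run as an argument, so you can pass the function however it is imported in your installation. The runfile name and datapaths in the commented usage are placeholders.

```python
def run_on_datasets(start_run, configfile, datapaths, **kwargs):
    """Call start_run once per datapath, overriding the runfile's datapath.

    Hypothetical helper, not part of Tayph. Extra keyword arguments
    (e.g. parallel, xcor_parallel) are forwarded unchanged to start_run.
    Returns the list of datapaths that were processed.
    """
    completed = []
    for dp in datapaths:
        start_run(configfile, dp=dp, **kwargs)
        completed.append(dp)
    return completed


# Usage (placeholder runfile and datapaths; import start_run as appropriate
# for your installation before uncommenting):
# run_on_datasets(start_run, 'my_runfile.dat',
#                 ['data/night1/', 'data/night2/'], xcor_parallel=False)
```

The runs execute one after another, so the memory considerations above apply to each run separately.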
Set the debug keyword to True to run the cascade in a stepwise manner, calling pdb.set_trace() after each step to allow on-the-fly inspection of variables and errors.