Spectra Module

The module pyspectools.spectra contains most of the high-level interface one needs to analyze broadband spectra. There are two submodules, analysis and assignment: the former contains lower-level functionality for performing certain analysis routines (e.g. peak finding), while the latter contains classes that implement most of the user interaction.

Submodules

pyspectools.spectra.analysis module

pyspectools.spectra.analysis.average_spectra(*arrays, **options) → numpy.ndarray[source]

Averages multiple spectra together, with a few options available to the user. The user provides an iterable of arrays, where the frequency axis of each array is the same and all arrays have the same length.

Options include performing a noise-weighted average, where each spectrum is weighted by the inverse of its average baseline, as determined by an ALS fit. The lower the average noise, the larger the weighting given to the spectrum.

Parameters
  • arrays (np.ndarray) – Iterable of NumPy arrays corresponding to the intensity

  • (bool) – If True, the averaging is done in the time domain, provided the input spectra are frequency domain. Defaults to False.

  • weighted (bool) – If True, weights the averaging by the average noise in each spectrum. Defaults to True.

Returns

Averaged intensities.

Return type

np.ndarray

Raises

ValueError – Error is raised if fewer than two spectra are given.
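The noise-weighted option can be illustrated with a minimal pure-Python sketch. It is not the library implementation: the function name weighted_average is hypothetical, and the noise level of each spectrum is passed in directly rather than estimated from an ALS baseline fit.

```python
from typing import List, Sequence


def weighted_average(spectra: Sequence[Sequence[float]],
                     noise_levels: Sequence[float]) -> List[float]:
    """Average equal-length spectra, weighting each by 1 / noise.

    Hypothetical helper: `noise_levels` stands in for the average
    baseline that the library is described as obtaining from an ALS fit.
    """
    if len(spectra) < 2:
        raise ValueError("At least two spectra are required.")
    length = len(spectra[0])
    if any(len(spectrum) != length for spectrum in spectra):
        raise ValueError("All spectra must have the same length.")
    # Lower noise gives a larger weight
    weights = [1.0 / noise for noise in noise_levels]
    total = sum(weights)
    return [
        sum(w * spectrum[i] for w, spectrum in zip(weights, spectra)) / total
        for i in range(length)
    ]


quiet = [0.0, 1.0, 0.0]   # noise level 1.0
noisy = [0.0, 3.0, 0.0]   # noise level 3.0
averaged = weighted_average([quiet, noisy], noise_levels=[1.0, 3.0])
```

Here the quieter spectrum carries three times the weight of the noisier one, so the averaged peak sits closer to 1.0 than a plain mean would place it.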

pyspectools.spectra.analysis.blank_spectrum(spectrum_df: pandas.core.frame.DataFrame, frequencies: numpy.ndarray, noise=0.0, noise_std=0.05, freq_col='Frequency', int_col='Intensity', window=1.0, df=True)[source]

Function to blank the peaks from a spectrum. Takes an iterable of frequencies, and replaces the intensity around each one with Gaussian noise generated from the average noise floor and standard deviation.

Parameters
  • spectrum_df (pandas DataFrame) – Pandas DataFrame containing the spectral data

  • frequencies (iterable of floats) – An iterable containing the center frequencies to blank

  • noise (float) – Average noise value for the spectrum. Typically measured by choosing a region void of spectral lines.

  • noise_std (float) – Standard deviation for the spectrum noise.

  • freq_col (str) – Name of the column in spectrum_df to use for the frequency axis

  • int_col (str) – Name of the column in spectrum_df to use for the intensity axis

  • window (float) – Value to use for the range to blank. The blanked region corresponds to frequency +/- window.

  • df (bool) – If True, returns a copy of the Pandas DataFrame with the blanked intensity. If False, returns a NumPy 1D array corresponding to the blanked intensity.

Returns

new_spec – If df is True, Pandas DataFrame with the intensity regions blanked. If df is False, NumPy 1D array.

Return type

pandas DataFrame or numpy 1D array
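As a rough illustration of the blanking idea, the sketch below works on plain lists instead of a DataFrame. The name blank_regions and the seed argument are hypothetical and exist only to keep the example reproducible; the noise, noise_std and window arguments mirror the parameters documented above.

```python
import random


def blank_regions(frequencies, intensities, centers,
                  noise=0.0, noise_std=0.05, window=1.0, seed=None):
    """Return a copy of `intensities` with Gaussian noise written over
    every channel within +/- `window` of one of the `centers`."""
    rng = random.Random(seed)
    blanked = list(intensities)
    for i, freq in enumerate(frequencies):
        if any(abs(freq - center) <= window for center in centers):
            # Replace the channel with a draw from N(noise, noise_std)
            blanked[i] = rng.gauss(noise, noise_std)
    return blanked


freqs = [float(f) for f in range(8000, 8011)]   # 8000 to 8010 MHz
signal = [0.0] * len(freqs)
signal[5] = 10.0                                # strong line at 8005 MHz
cleaned = blank_regions(freqs, signal, centers=[8005.0], window=1.0, seed=1)
```

Only the channels at 8004, 8005 and 8006 MHz are overwritten; everything outside the window is returned unchanged.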

pyspectools.spectra.analysis.bokeh_create_experiment_comparison(experiments, thres_prox=0.2, index=0, filepath=None, **kwargs)[source]

Function to create a plot comparing multiple experiments. This is a high level function that wraps the correlate_experiments function, and provides a visual and interactive view of the spectra output from this function using Bokeh.

Parameters
  • experiments

  • thres_prox

  • index

  • filepath

  • kwargs

Function that will search for possible harmonic candidates in a list of frequencies. Wraps the lower level function.

Generates every possible 4-membered combination of the frequencies, and makes a first pass filtering out unreasonable combinations.

Parameters
  • frequencies – Iterable containing floats of frequencies (ulines)

  • maxJ – Maximum value of J considered for quantum numbers

  • dev_thres – Standard deviation threshold for filtering unlikely combinations of frequencies

  • prefilter (bool) – Dictates whether or not the frequency lists are prescreened by standard deviation. This potentially biases away from missing transitions!

Returns

  • results_df – Pandas dataframe containing RMS information and fitted constants

  • fit_results – List containing all of the ModelResult objects

pyspectools.spectra.analysis.calc_line_weighting(frequency: float, catalog_df: pandas.core.frame.DataFrame, prox=5e-05, abs=True, freq_col='Frequency', int_col='Intensity')[source]

Function for calculating the weighting factor for determining the likelihood of an assignment. The weighting factor is determined by the proximity of the catalog frequency to the observed frequency, as well as the theoretical intensity if it is available.

Parameters
  • frequency (float) – Observed frequency in MHz

  • catalog_df (dataframe) – Pandas dataframe containing the catalog data entries

  • prox (float, optional) – Frequency proximity threshold

  • abs (bool) – Specifies whether argument prox is taken as the absolute value

Returns

  • None – If nothing matches the frequency, returns None.

  • dataframe – If matches are found, calculate the weights and return the candidates in a dataframe.

pyspectools.spectra.analysis.cluster_AP_analysis(progression_df: pandas.core.frame.DataFrame, sil_calc=False, refit=False, **kwargs)[source]

Wrapper for the AffinityPropagation cluster method from scikit-learn.

The dataframe provided will also receive new columns: Cluster index, and Silhouette. The latter corresponds to how likely a sample is sandwiched between clusters (0), how squarely it belongs in the assigned cluster (+1), or does not belong (-1). The cluster index corresponds to which cluster the sample belongs to.

Parameters
  • progression_df – Pandas dataframe taken from the result of progression fits

  • sil_calc (bool) – Indicates whether silhouettes are calculated after the AP model is fit

Returns

  • data (dict) – Dictionary containing clustered frequencies and associated fits

  • ap_obj – AffinityPropagation object containing all the information as attributes.

pyspectools.spectra.analysis.copy_assignments(A, B, corr_mat)[source]

Function to copy assignments from experiment B over to experiment A. The correlation matrix argument requires the output from the correlate_experiments function.

Parameters
  • A, B (AssignmentSession) – AssignmentSession objects, where the assignments from B are copied into A

  • corr_mat (2D array) – 2D array mask with length A x B

pyspectools.spectra.analysis.correlate_experiments(experiments, thres_prox=0.2, index=0)[source]

Function to find correlations between experiments, looking for common peaks detected in every provided experiment. By default, this function uses the first experiment as the base for comparison. Coincidences are searched for between this base and each of the other provided experiments, and ultimately combined to determine the common peaks.

A copy of the base experiment is returned, along with a dictionary with frequencies of correlations between a given experiment and the base.

Parameters
  • experiments (tuple-like) – Iterable list/tuple of AssignmentSession objects.

  • thres_prox (float, optional) – Proximity in frequency units for determining if peaks are the same. If thres-abs is False, this value is treated as a percentage of the center frequency.

  • index (int, optional) – Index for the experiment to use as a base for comparisons.

Returns

  • base_exp (AssignmentSession object) – A deep copy of the first experiment, with the updated spectra.

  • return_dict (dict) – Dictionary where keys correspond to the experiment number and values are 1D arrays of frequencies that are coincident
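The coincidence search described above can be sketched on bare lists of peak frequencies. This is an illustration only: the function names are hypothetical, thres_prox is treated as an absolute frequency tolerance, and the real function operates on AssignmentSession objects rather than lists.

```python
def coincident_peaks(base, other, thres_prox=0.2):
    """Frequencies in `base` with a partner in `other` that lies
    within `thres_prox` (absolute frequency units)."""
    return [f for f in base
            if any(abs(f - g) <= thres_prox for g in other)]


def common_peaks(experiments, thres_prox=0.2, index=0):
    """Compare the base experiment against every other experiment.

    Returns the peaks common to all experiments, plus a dictionary
    mapping each experiment number to its coincidences with the base.
    """
    base = experiments[index]
    per_experiment = {
        i: coincident_peaks(base, exp, thres_prox)
        for i, exp in enumerate(experiments) if i != index
    }
    common = [f for f in base
              if all(f in matches for matches in per_experiment.values())]
    return common, per_experiment


exp_a = [8000.10, 9500.00, 12000.50]
exp_b = [8000.15, 12000.45, 15000.00]
exp_c = [8000.05, 9500.10, 12000.60]
common, matches = common_peaks([exp_a, exp_b, exp_c])
```

With exp_a as the base, the 9500.00 MHz peak has a partner in exp_c but not in exp_b, so it appears in matches[2] and is left out of common.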

pyspectools.spectra.analysis.create_cluster_tests(cluster_dict: Dict[Any, Any], shots=25, dipole=1.0, min_dist=500.0, **kwargs)[source]

Take the output of the cluster AP analysis, and generate the FTB batch files for a targeted DR search.

Parameters
  • cluster_dict (dict) – Cluster dictionary with keys corresponding to the cluster number, and values are subdictionaries holding the frequencies associated with the cluster.

  • shots (int, optional) – Number of integration counts

  • dipole (float, optional) – Approximate dipole moment to target

  • min_dist (float, optional) – Minimum frequency difference between the cavity and DR frequencies.

  • kwargs – Additional kwargs are passed to the ftb line generation.

pyspectools.spectra.analysis.cross_correlate(a: numpy.ndarray, b: numpy.ndarray, lags=None)[source]

Cross-correlate two arrays a and b that are of equal length by lagging b with respect to a. Uses np.roll to shift b by values of lag, and appropriately zeros out “out of bounds” values.

Parameters
  • a, b (np.ndarray) – Arrays containing the values to cross-correlate. Must be the same length.

  • lags (optional) – Lag values by which b is shifted with respect to a, by default None
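A small stand-alone sketch of lagged cross-correlation with zero fill follows. It swaps np.roll for an explicit shift on plain lists so that no wrap-around needs to be undone. The names are hypothetical, and the choice of scanning every possible lag when lags is None is an assumption, not documented behavior.

```python
def shift_with_zeros(values, lag):
    """Shift `values` by `lag` samples (positive lag moves to the right),
    filling vacated positions with zeros instead of wrapping around."""
    n = len(values)
    shifted = [0.0] * n
    for i, value in enumerate(values):
        j = i + lag
        if 0 <= j < n:
            shifted[j] = value
    return shifted


def cross_correlate_sketch(a, b, lags=None):
    """Return {lag: sum(a * shifted b)} for each requested lag."""
    if len(a) != len(b):
        raise ValueError("Arrays must be the same length.")
    if lags is None:
        # Assumption: scan every lag that still leaves some overlap
        lags = range(-(len(a) - 1), len(a))
    return {
        lag: sum(x * y for x, y in zip(a, shift_with_zeros(b, lag)))
        for lag in lags
    }


a = [0.0, 0.0, 1.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0, 0.0, 0.0]
result = cross_correlate_sketch(a, b)
best_lag = max(result, key=result.get)
```

The peak in b has to move one sample to the right to line up with the peak in a, so the correlation is largest at a lag of +1.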

pyspectools.spectra.analysis.detect_artifacts(frequencies: numpy.ndarray, tol=0.002)[source]

Quick one-liner function to perform a very rudimentary test for RFI. This method relies on the assumption that any frequency that is suspiciously close to an exact number (e.g. 16250.0000) is very likely an artifact.

The function will calculate the difference between each frequency and its nearest whole number, and return frequencies that are within a specified tolerance.

Parameters
  • frequencies (NumPy 1D array) – Array of frequencies to check for artifacts.

  • tol (float, optional) – Maximum tolerance to be used to check whether frequency is close enough to its rounded value, by default 2e-3

Returns

Returns the frequencies that match the specified criteria.

Return type

NumPy 1D array
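The test described above amounts to a single comparison per frequency. A minimal sketch on a plain list is shown here; the function name is hypothetical, and whether the library uses an inclusive or strict comparison against tol is not stated, so the inclusive form is an assumption.

```python
def near_integer_frequencies(frequencies, tol=2e-3):
    """Return the frequencies lying within `tol` of a whole number."""
    return [f for f in frequencies if abs(f - round(f)) <= tol]


peaks = [16250.0000, 8123.4567, 12000.0015, 9999.9990, 10500.5000]
suspects = near_integer_frequencies(peaks)
```

In this example the entries at 16250.0000, 12000.0015 and 9999.9990 MHz are flagged as likely artifacts, while the other two are kept as potential molecular features.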

pyspectools.spectra.analysis.filter_spectrum(intensity: float, window='hanning', sigma=0.5)[source]

Apply a specified window function to a signal. The window functions are taken from the signal.windows module of SciPy, so check what is available before throwing it into this function.

The window function is convolved with the signal by taking the time-domain product, and doing the inverse FFT to get the convolved spectrum back.

The one exception is the gaussian window - if a user specifies “gaussian” for the window function, the actual window function applied here is a half gaussian, i.e. a 1D gaussian blur.

Parameters
  • dataframe (pandas DataFrame) – Pandas dataframe containing the spectral information

  • int_col (str, optional) – Column name to reference the signal

  • window (str, optional) – Name of the window function as implemented in SciPy.

  • sigma

Returns

new_y – Numpy 1D array containing the convolved signal

Return type

array_like

pyspectools.spectra.analysis.find_series(combo: Tuple[float, float], frequencies: numpy.ndarray, search=0.005)[source]

Function that will exhaustively search for candidate progressions based on a pair of frequencies.

The difference of the pair is used to estimate B, which is then used to calculate J. These values of J are then used to predict the next set of lines, which are searched for in the soup of frequencies. The closest matches are added to a list which is returned.

This is done so that even if frequencies are missing a series of lines can still be considered.

Parameters
  • combo – Pair of frequencies corresponding to the initial guess

  • frequencies – Array of frequencies to be searched

  • search – Optional threshold for determining the search range to look for candidates

Returns

Array of candidate frequencies
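The search logic can be sketched under explicit assumptions: the pair is taken to be two adjacent transitions of a rigid linear rotor, so that a line with upper-state quantum number n = J + 1 falls at 2B·n and the pair spacing equals 2B, and search is treated as a fractional tolerance on each predicted frequency. The function name is hypothetical and the library's actual prediction scheme may differ.

```python
def find_series_sketch(combo, frequencies, search=0.005):
    """Estimate B from a pair of lines and collect the nearest
    observed frequency to each predicted harmonic."""
    low, high = sorted(combo)
    two_b = high - low                  # spacing of adjacent lines is 2B
    if two_b == 0:
        raise ValueError("The pair must contain two distinct frequencies.")
    j_lower = round(low / two_b) - 1    # lower-state J of the lower line
    candidates = []
    n = 1
    while two_b * n <= max(frequencies) * (1 + search):
        predicted = two_b * n
        nearest = min(frequencies, key=lambda f: abs(f - predicted))
        # Keep the closest match only if it is within the tolerance
        if abs(nearest - predicted) <= search * predicted:
            candidates.append(nearest)
        n += 1
    return two_b / 2.0, j_lower, candidates


# Soup of frequencies: a B = 1500 MHz series with the 18000 MHz line
# missing, plus two unrelated lines.
soup = [9000.0, 12000.0, 15000.3, 21000.1, 13337.2, 7421.9]
b_est, j_low, series = find_series_sketch((9000.0, 12000.0), soup)
```

The line expected near 18000 MHz is absent from the soup, yet the series continues on to 21000.1 MHz, which is the behavior the description above asks for.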

pyspectools.spectra.analysis.fit_line_profile(spec_df: pandas.core.frame.DataFrame, center: float, width=None, intensity=None, freq_col='Frequency', int_col='Intensity', fit_func=<class 'lmfit.models.GaussianModel'>, sigma=2, logger=None)[source]

Somewhat high level function that wraps lmfit for fitting Gaussian lineshapes to a spectrum.

For a given guess center and optional intensity, the

pyspectools.spectra.analysis.harmonic_finder(frequencies: numpy.ndarray, search=0.001, low_B=400.0, high_B=9000.0)[source]

Function that will generate candidates for progressions. Every possible pair of frequencies is looped over, checking whether the implied B value is either too small (a very large molecule, like C60) or too large (you won’t have enough lines to make a progression), and the frequencies are then searched to find the nearest candidates based on a prediction.

Parameters
  • frequencies – Array or tuple-like containing the progressions we expect to find

  • search – Optional threshold for determining if something is close enough

Returns

progressions – List of arrays corresponding to candidate progressions

pyspectools.spectra.analysis.line_weighting(frequency: float, catalog_frequency: float, intensity=None)[source]

Function for calculating the line weighting associated with each assignment candidate. The formula is based on intensity and frequency offset, such as to favor strong lines that are spot on over weak lines that are further away.

Parameters
  • frequency (float) – Center frequency in MHz; typically the u-line frequency.

  • catalog_frequency (float) – Catalog frequency of the candidate

  • intensity (float, optional) – log Intensity of the transition; includes the line strength and the temperature factor.

Returns

weighting – Associated weight value. Requires normalization

Return type

float

pyspectools.spectra.analysis.match_artifacts(on_exp, off_exp, thres=0.05, freq_col='Frequency')[source]

Function to remove a set of artifacts found in a blank spectrum.

Parameters
  • on_exp (AssignmentSession object) – Experiment with the sample on; i.e. contains molecular features

  • off_exp (AssignmentSession object) – Experiment with no sample; i.e. only artifacts

  • thres (float, optional) – Threshold in absolute frequency units to match

  • freq_col (str, optional) – Column specifying frequency in the pandas dataframes

Returns

candidates – Dictionary with keys corresponding to the uline index, and values the frequency

Return type

dict

pyspectools.spectra.analysis.peak_find(spec_df: pandas.core.frame.DataFrame, freq_col='Frequency', int_col='Intensity', thres=0.015, min_dist=10)[source]

Wrapper for peakutils applied to pandas dataframes. First finds the peak indices, which are then used to fit Gaussians to determine the center frequency for each peak.

Parameters
  • spec_df (dataframe) – Pandas dataframe containing the spectrum information, with columns corresponding to frequency and intensity.

  • freq_col (str, optional) – Name of the frequency column in spec_df

  • int_col (str, optional) – Name of the intensity column in spec_df

  • thres (float, optional) – Threshold for peak detection

Returns

peak_df – Pandas dataframe containing the peak frequencies and intensities

Return type

dataframe

pyspectools.spectra.analysis.plotly_create_experiment_comparison(experiments, thres_prox=0.2, index=0, filepath=None, **kwargs)[source]

Function to create a plot comparing multiple experiments. This is a high level function that wraps the correlate_experiments function, and provides a visual and interactive view of the spectra output from this function using Plotly.

This function is effectively equivalent to bokeh_create_experiment_comparison, however uses Plotly as the front end instead.

Parameters
  • experiments

  • thres_prox

  • index

  • filepath

  • kwargs

pyspectools.spectra.analysis.search_center_frequency(frequency: float, width=0.5)[source]

Function for wrapping the astroquery Splatalogue API for looking up a frequency and finding candidate molecules for assignment. The width parameter adjusts the +/- range to include in the search: for high frequency surveys, it’s probably preferable to use a percentage to accommodate the typically larger uncertainties (sub-mm experiments).

Parameters
  • frequency (float) – Frequency in MHz to search Splatalogue for.

  • width (float, optional) – Absolute frequency offset in MHz to include in the search.

Returns

Pandas dataframe containing frequency matches, or None if no matches are found.

Return type

dataframe or None

pyspectools.spectra.analysis.search_molecule(species: str, freq_range=[0.0, 40000.0], **kwargs)[source]

Function to search Splatalogue for a specific molecule. Technically I’d prefer to download entries from CDMS instead, but this is probably the most straightforward way.

The main use for this function is to verify line identifications - if a U-line is tentatively assigned to a molecule, then other transitions for that molecule that are stronger or comparably strong should be visible.

Parameters
  • species (str) – Chemical name of the molecule

  • freq_range (list) – The frequency range to perform the lookup

Returns

Pandas dataframe containing transitions for the given molecule. If no matches are found, returns None.

Return type

DataFrame or None

pyspectools.spectra.assignment module

assignment module

This module contains three main classes for performing analysis of broadband spectra. The AssignmentSession class is what the user will mainly interact with: it digests a spectrum, finds peaks, makes assignments and keeps track of them, and generates the reports at the end.

To perform the assignments, the user can use the LineList class, which does the grunt work of homogenizing the different sources of frequency and molecular information: it is able to take SPCAT and .lin formats, as well as simply a list of frequencies. LineList then interacts with the AssignmentSession class, which handles the assignments.

The smallest building block in this procedure is the Transition class; every peak, every molecule transition, every artifact is considered as a Transition object. The LineList contains a list of Transition objects, and the peaks found by the AssignmentSession are also kept as a LineList.

class pyspectools.spectra.assignment.AssignmentSession(exp_dataframe: pandas.core.frame.DataFrame, experiment: int, composition: List[str], temperature=4.0, velocity=0.0, freq_col='Frequency', int_col='Intensity', verbose=True, **kwargs)[source]

Bases: object

Main class for bookkeeping and analyzing broadband spectra. This class revolves around operating on a single continuous spectrum, using the class functions to automatically assess the noise statistics, find peaks, and do the bulk of the bookkeeping on what molecules are assigned to what peak.

add_ulines(data: List[Tuple[float, float]], **kwargs)[source]

Function to manually add multiple pairs of frequency/intensity to the current experiment’s Peaks list.

Kwargs are passed to the creation of the Transition object.

Parameters

data (iterable of 2-tuple) –

List-like of 2-tuples corresponding to frequency and intensity. Data should look like this example: [(12345.213, 5.), (18623.125, 12.3)]

analyze_molecule(Q=None, T=None, name=None, formula=None, smiles=None, chi_thres=10.0)[source]

Function for providing some astronomically relevant parameters by analyzing Gaussian line shapes.

Parameters
  • Q (float) – Partition function at temperature T

  • T (float) – Temperature in Kelvin

  • name (str, optional) – Name of the molecule to perform the analysis on. Can be used as a selector.

  • formula (str, optional) – Chemical formula of the molecule to perform the analysis on. Can be used as a selector.

  • smiles (str, optional) – SMILES code of the molecule to perform the analysis on. Can be used as a selector.

  • chi_thres (float) – Threshold for the Chi Squared value to consider fits for statistics. Any instances of fits with Chi squared values above this value will not be used to calculate line profile statistics.

Returns

return_data – First element is the profile dataframe, and second element is the fitted velocity. If a rotational temperature analysis is also performed, the third element will be the least-squares regression.

Return type

list

apply_filter(window: Union[str, List[str]], sigma=0.5, int_col=None)[source]

Applies a filter to the spectral signal. If multiple window functions are to be used, a list of windows can be provided, which will then perform the convolution stepwise. With the exception of the gaussian window function, the other window functions use the SciPy signal functions.

A reference copy of the original signal is kept as the “Ref” column; this is used if a new window function is applied, rather than on the already convolved signal.

Parameters
  • window (str, or iterable of str) – Name of the window function

  • sigma (float, optional) – Specifies the magnitude of the gaussian blur. Only used when the window function asked for is “gaussian”.

  • int_col (None or str, optional) – Specifies which column to apply the window function to. If None, defaults to the session-wide intensity column

blank_spectrum(noise=0.0, noise_std=0.05, window=1.0)[source]

Blanks a spectrum based on the lines already previously assigned. The required arguments are the average and standard deviation of the noise, typically estimated by picking a region free of spectral features.

The spectra are sequentially blanked - online catalogs first, followed by literature species, finally the private assignments.

Parameters
  • noise (float) – Average noise value for the spectrum. Typically measured by choosing a region void of spectral lines.

  • noise_std (float) – Standard deviation for the spectrum noise.

  • window (float) – Value to use for the range to blank. The blanked region corresponds to frequency +/- window.

calculate_assignment_statistics()[source]

Function for calculating some aggregate statistics of the assignments and u-lines. This breaks the assignment sources up to identify what the dominant source of information was. The two metrics for assignments are the number of transitions and the intensity contribution assigned by a particular source.

Return type

dict

clean_folder(action=False)[source]

Method for cleaning up all of the directories used by this routine. Use with caution!!!

Requires passing a True statement to actually clean up.

Parameters

action (bool) – If True, folders will be deleted. If False (default) nothing is done.

clean_spectral_assignments(window=1.0)[source]

Function to blank regions of the spectrum that have already been assigned. This function takes the frequencies of assignments, and uses the noise statistics to generate white noise to replace the peak. This is to let one focus on unidentified features, rather than be distracted by the assignments with large amplitudes.

Parameters

window (float, optional) – Frequency value in MHz to blank. The region corresponds to the frequency +/- this value.

copy_assignments(other: pyspectools.spectra.assignment.AssignmentSession, thres_prox=0.01)[source]

Function to copy assignments from another experiment. This class method wraps two analysis routines: first, correlations in detected peaks are found, and indexes of where correlations are found will be used to locate the corresponding Transition object, and copy its data over to the current experiment.

Parameters
  • other (AssignmentSession object) – The reference AssignmentSession object to copy assignments from

  • thres_prox (float, optional) – Threshold for considering coincidences between spectra.

create_full_dr_batch(cavity_freqs: List[float], filepath=None, shots=25, dipole=1.0, min_dist=500.0, atten=None, drpower=13)[source]

Create an FTB batch file for use in QtFTM to perform a DR experiment. A list of selected frequencies can be used as the cavity frequencies, which will subsequently be exhaustively DR’d against by ALL frequencies in the experiment.

The file is then saved to “ftb/XXX-full-dr.ftb”.

The atten parameter provides a more direct way to control RF power; if this value is used, it will overwrite the dipole moment setting.

Parameters
  • cavity_freqs (iterable of floats) – Iterable of frequencies to tune to, in MHz.

  • filepath (str, optional) – Path to save the ftb file to. Defaults to ftb/{}-dr.ftb

  • shots (int, optional) – Number of integration shots

  • dipole (float, optional) – Dipole moment used for attenuation setting

  • min_dist (float, optional) – Minimum frequency difference between cavity and DR frequency to actually perform the experiment

  • atten (None or int, optional) – Value to set the rf attenuation. By default, this is None, which will use the dipole moment instead to set the rf power. If a value is provided, it will overwrite whatever the dipole moment setting is.
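The pairing logic behind the batch (every cavity frequency against every frequency in the experiment, skipping pairs closer than min_dist) can be sketched as follows. The function name is hypothetical, the inclusive comparison is an assumption, and no attempt is made to reproduce the FTB file format itself.

```python
def dr_pairs(cavity_freqs, all_freqs, min_dist=500.0):
    """Return (cavity, dr) frequency pairs separated by at least
    `min_dist` MHz."""
    return [
        (cavity, dr)
        for cavity in cavity_freqs
        for dr in all_freqs
        if abs(cavity - dr) >= min_dist
    ]


cavities = [8000.0]
lines = [8000.0, 8300.0, 9100.0, 12500.0]
pairs = dr_pairs(cavities, lines)
```

With the default min_dist of 500 MHz, the cavity frequency itself and the line at 8300 MHz are skipped as DR frequencies, which leaves two pairs to measure.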

create_latex_table(filepath=None, header=None, cols=None, **kwargs)[source]

Method to create a LaTeX table summarizing the measurements in this experiment.

Without any additional inputs, the table will be printed into a .tex file in the reports folder. The table will be created with the minimum amount of information required for a paper, including the frequency and intensity information, assignments, and the source of the information.

The user can override the default settings by supplying header and col arguments, and any other kwargs are passed into the to_latex pandas DataFrame method. The header and col lengths must match.

Parameters
  • filepath (str, optional) – Filepath to save the LaTeX table to; by default None

  • header (iterable of str, optional) – An iterable of strings specifying the header to be printed. By default None

  • cols (iterable of str, optional) – An iterable of strings specifying which columns to include. If this is changed, the header must also be changed to reflect the new columns.

create_uline_dr_batch(filepath=None, select=None, shots=25, dipole=1.0, min_dist=500.0, thres=None, atten=None, drpower=13)[source]

Create an FTB batch file for use in QtFTM to perform a DR experiment. A list of selected frequencies can be used as the cavity frequencies, which will subsequently be exhaustively DR’d against by all of the U-line frequencies remaining in this experiment.

The file is then saved to “ftb/XXX-dr.ftb”.

Parameters
  • filepath (str, optional) – Path to save the ftb file to. Defaults to ftb/{}-dr.ftb

  • select (list of floats, optional) – List of frequencies to use as cavity frequencies. Defaults to None, which will just DR every frequency against each other.

  • shots (int, optional) – Number of integration shots

  • dipole (float, optional) – Dipole moment used for attenuation setting

  • min_dist (float, optional) – Minimum frequency difference between cavity and DR frequency to actually perform the experiment

  • thres (None or float, optional) – Minimum value in absolute intensity units to consider in the DR batch. If None, this is ignored (default).

  • atten (None or int, optional) – Value to use for the attenuation, overwriting the dipole argument. This is useful for forcing cavity power in the high band.

create_uline_ftb_batch(filepath=None, shots=500, dipole=1.0, threshold=0.0, sort_int=False, atten=None)[source]

Create an FTB file for use in QtFTM based on the remaining ulines. This is used to provide cavity frequencies.

If a filepath is not specified, a -uline.ftb file will be created in the ftb folder.

The user has the ability to control parameters of the batch by setting a global shot count, dipole moment, and minimum intensity value for creation.

Parameters
  • filepath (str or None, optional) – Path to save the .ftb file to. If None, defaults to the session ID.

  • shots (int, optional) – Number of shots to integrate on each frequency

  • dipole (float, optional) – Dipole moment in Debye attenuation target for each frequency

  • threshold (float, optional) – Minimum value for the line intensity to be considered. For example, if the spectrum is analyzed in units of SNR, this would be the minimum value of SNR to consider in the FTB file.

  • sort_int (bool, optional) – If True, sorts the FTB entries in descending intensity order.

  • atten (None or int, optional) – Value to use for the attenuation, overwriting the dipole argument. This is useful for forcing cavity power in the high band.

create_ulinelist(filepath: str, silly=True)[source]

Create a LineList object for an unidentified molecule. This uses the class method umol_gen to automatically generate names for U-molecules which can then be renamed once it has been identified.

The session attribute umol_names also keeps track of filepaths to catalog names. If the filepath has been used previously, then it will raise an Exception noting that the filepath is already associated with another catalog.

Parameters
  • filepath (str) – File path to the catalog or .lin file to use as a reference

  • silly (bool, optional) – Flag whether to use boring numbered identifiers, or randomly generated AdjectiveAdjectiveAnimal.

Returns

Return type

LineList object

detect_noise_floor(region=None, als=True, **kwargs)[source]

Set the noise parameters for the current spectrum. Control over what “defines” the noise floor is specified with the parameter region. By default, if region is None then the function will perform an initial peak find using 1% of the maximum intensity as the threshold. The noise region will be established based on the largest gap between peaks, i.e. hopefully capturing as few features in the statistics as possible.

The alternative method is invoked when the als argument is set to True, which will use the asymmetric least-squares method to determine the baseline. Afterwards, the baseline is decimated by an extremely heavy Gaussian blur, and one ends up with a smoothly varying baseline. In this case, there is no noise_rms attribute to be returned as it is not required to determine the minimum peak threshold.

Parameters
  • region (2-tuple or None, optional) – If None, use the automatic algorithm. Otherwise, a 2-tuple specifies the region of the spectrum in frequency to use for noise statistics.

  • als (bool, optional) – If True, will use the asymmetric least squares method to determine the baseline.

  • kwargs – Additional kwargs are passed into the ALS function.

Returns

  • baseline (float) – Value of the noise floor

  • rms (float) – Noise RMS/standard deviation
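The automatic (non-ALS) route can be illustrated with a short sketch that takes peak positions as given, picks the widest gap between neighbouring peaks, and computes statistics over the channels inside it. The function name is hypothetical and the initial peak find at 1% of the maximum intensity is left out.

```python
import statistics


def noise_from_largest_gap(frequencies, intensities, peak_freqs):
    """Return (baseline, rms) from the channels lying strictly inside
    the largest gap between neighbouring peaks."""
    peaks = sorted(peak_freqs)
    if len(peaks) < 2:
        raise ValueError("At least two peaks are needed to define a gap.")
    # Widest interval between two adjacent peaks
    low, high = max(zip(peaks[:-1], peaks[1:]),
                    key=lambda pair: pair[1] - pair[0])
    region = [y for x, y in zip(frequencies, intensities) if low < x < high]
    return statistics.mean(region), statistics.pstdev(region)


freqs = [float(f) for f in range(8000, 8010)]
ints = [5.0, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 6.0, 0.0, 7.0]
baseline, rms = noise_from_largest_gap(freqs, ints, [8000.0, 8007.0, 8009.0])
```

The widest gap here runs from 8000 to 8007 MHz, so the six channels in between define the noise statistics. In a real spectrum the wings of the bounding peaks may still leak into the region.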

df2ulines(dataframe: pandas.core.frame.DataFrame, freq_col=None, int_col=None)[source]

Add a dataframe of frequency and intensities to the session U-line dictionary. This function provides more manual control over what can be processed in the assignment pipeline, as not everything can be picked up by the peak finding algorithm.

Parameters
  • dataframe (pandas dataframe) – Dataframe containing a frequency and intensity column to add to the uline list.

  • freq_col (None or str) – Specify column to use for frequencies. If None, uses the session value freq_col.

  • int_col (None or str) – Specify column to use for intensities. If None, uses the session value int_col.

finalize_assignments()[source]

Function that will complete the assignment process by serializing DataClass objects and formatting a report.

Creates summary pandas dataframes as self.table and self.profiles, which correspond to the assignments and fitted line profiles respectively.

find_peaks(threshold=None, region=None, sigma=6, min_dist=10, als=True, **kwargs)[source]

Find peaks in the experiment spectrum, with a specified threshold value or automatic threshold. The method calls the peak_find function from the analysis module, which in itself wraps peakutils.

The function works by finding regions of the intensity where the first derivative goes to zero and changes sign. This gives peak frequency/intensities from the digitized spectrum, which is then “refined” by interpolating over each peak and fitting a Gaussian to determine the peak.

The peaks are then returned as a pandas DataFrame, which can also be accessed in the peaks_df attribute of AssignmentSession.

When a value of threshold is not provided, the function will turn to use automated methods for noise detection, either by taking a single value as the baseline (not ALS), or by using the asymmetric least-squares method for fitting the baseline. In both instances, the primary intensity column to be used for analysis will be changed to “SNR”, which is the recommended approach.

To use the ALS algorithm there may be some tweaking involved for the parameters. These are typically found empirically, but for reference here are some “optimal” values that have been tested.

For millimeter-wave spectra, larger values of lambda are favored:

lambda = 1e5, p = 0.1

This should get rid of periodic (fringe) baselines, and leave the “real” signal behind.

Parameters
  • threshold (float or None, optional) – Peak detection threshold. If None, will take 1.5 times the noise RMS.

  • region (2-tuple or None, optional) – If None, use the automatic algorithm. Otherwise, a 2-tuple specifies the region of the spectrum in frequency to use for noise statistics.

  • sigma (float, optional) – Defines the number of sigma (noise RMS) above the baseline to use as the peak detection threshold.

  • min_dist (int, optional) – Number of channels between peaks to be detected.

  • als (bool, optional) – If True, uses ALS fitting to determine a baseline.

  • kwargs – Additional keyword arguments are passed to the ALS fitting routine.

Returns

peaks_df – Pandas dataframe with Frequency/Intensity columns, corresponding to peaks

Return type

dataframe
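The derivative-and-threshold idea described for find_peaks can be sketched in plain Python. This is only an illustration, not the peakutils-based implementation: the helper name find_peaks_sketch and the synthetic data are hypothetical, the noise statistics come from a user-chosen window (as with the region argument), and the ALS baseline and Gaussian refinement steps are left out.

```python
from statistics import mean, pstdev


def find_peaks_sketch(freqs, intens, noise_region, sigma=6.0, min_dist=10):
    """Pick local maxima more than sigma * noise RMS above the baseline."""
    # Noise statistics from a signal-free frequency window.
    lo, hi = noise_region
    noise = [y for x, y in zip(freqs, intens) if lo <= x <= hi]
    threshold = mean(noise) + sigma * pstdev(noise)

    peaks = []
    for i in range(1, len(intens) - 1):
        rising = intens[i] - intens[i - 1] > 0      # slope positive before the point
        falling = intens[i + 1] - intens[i] <= 0    # slope non-positive after it
        if not (rising and falling and intens[i] > threshold):
            continue
        # Enforce a minimum channel separation; keep the stronger of two close peaks.
        if peaks and i - peaks[-1] < min_dist:
            if intens[i] > intens[peaks[-1]]:
                peaks[-1] = i
            continue
        peaks.append(i)
    return [(freqs[i], intens[i]) for i in peaks]


# Synthetic spectrum: small alternating ripple plus two lines.
freqs = [8000.0 + 0.1 * i for i in range(100)]
intens = [0.01 * (-1) ** i for i in range(100)]
intens[60] = 1.0
intens[80] = 0.5
found = find_peaks_sketch(freqs, intens, noise_region=(8000.0, 8004.0))
```

Here the ripple produces many local maxima, but only the two channels above the sigma-scaled threshold are kept.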

find_progressions(search=0.001, low_B=400.0, high_B=9000.0, refit=False, plot=True, preferences=None, **kwargs)[source]

Performs a search for possible harmonically related U-lines. The first step loops over every possible U-line pair, and uses the difference to estimate an effective B value for predicting the next transition. If the search is successful, the U-line is added to the list. The search is repeated until there are no more U-lines within frequency range of the next predicted line.

Once the possible series are identified, the frequencies are fit to an effective linear molecule model (B and D terms). An affinity propagation cluster model is then used to group similar progressions together, with either a systematic test of preference values or a user specified value.

Parameters
  • search (float, optional) – Percentage value of the target frequency cutoff for excluding possible candidates in the harmonic search

  • low_B, high_B (float, optional) – Minimum and maximum value of B in MHz to be considered. This constrains the size of the molecule you are looking for.

  • refit (bool, optional) – If True, B and D are refit to the cluster frequencies.

  • plot (bool, optional) – If True, creates a Plotly scatter plot of the clusters, as a function of the preference values.

  • preferences (float or array_like of floats, optional) – A single value or an array of preference values for the AP cluster model. If None, the clustering will be performed on a default grid, where all of the results are returned.

  • kwargs (optional) – Additional kwargs are passed to the AP model initialization.

Returns

Return type

fig
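The first stage of the search (pairing U-lines, estimating an effective B, and stepping to the next predicted line) can be sketched as follows. This is a simplified illustration under stated assumptions: lines are taken to follow the rigid linear rotor spacing of 2B between neighbours, search is treated as a fraction of the predicted frequency, and the function name harmonic_progressions is hypothetical. The B/D fitting and affinity propagation clustering are not reproduced.

```python
def harmonic_progressions(ulines, search=0.001, low_B=400.0, high_B=9000.0):
    """Return series of U-line frequencies (MHz) spaced by a near-constant 2B."""
    ulines = sorted(ulines)
    progressions = []
    for i, f1 in enumerate(ulines):
        for f2 in ulines[i + 1:]:
            # Neighbouring lines of a linear rotor are separated by about 2B.
            B_eff = (f2 - f1) / 2.0
            if not low_B <= B_eff <= high_B:
                continue
            series = [f1, f2]
            while True:
                predicted = series[-1] + 2.0 * B_eff
                window = search * predicted
                matches = [
                    f for f in ulines
                    if f > series[-1] and abs(f - predicted) <= window
                ]
                if not matches:
                    break  # no U-line near the next predicted transition
                series.append(min(matches, key=lambda f: abs(f - predicted)))
            if len(series) > 2:
                progressions.append(series)
    return progressions


# Four lines of a B = 1500 MHz rotor (slightly perturbed) and one unrelated line.
ulines = [9000.0, 12000.2, 15000.1, 18000.0, 10111.0]
progressions = harmonic_progressions(ulines)
```

Sub-series of the same progression are returned as well, which is one reason a clustering step is useful afterwards.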

classmethod from_ascii(filepath: str, experiment: int, composition=['C', 'H'], delimiter='\t', temperature=4.0, velocity=0.0, col_names=None, freq_col='Frequency', int_col='Intensity', skiprows=0, verbose=False, **kwargs)[source]

Class method for AssignmentSession to generate a session using an ASCII file. This is the preferred method for starting an AssignmentSession. The ASCII files are parsed using the pandas method read_csv, with the arguments for reading simply passed to that function.

Example based on blackchirp spectrum: The first row in an ASCII output from blackchirp contains the headers, which typically should be renamed to “Frequency, Intensity”. This can be done with this call:

```
from pyspectools.spectra.assignment import AssignmentSession

session = AssignmentSession.from_ascii(
    filepath="ft1020.txt",
    experiment=0,
    col_names=["Frequency", "Intensity"],
    skiprows=1,
)
```

Example based on astronomical spectra: File formats are not homogenized, and delimiters may change. This example reads in a comma-separated spectrum, with a radial velocity of +26.2 km/s.

```
session = AssignmentSession.from_ascii(
    filepath="spectrum.mid.dat",
    experiment=0,
    col_names=["Frequency", "Intensity"],
    velocity=26.2,
    delimiter=",",
)
```

Parameters
  • filepath (str) – Filepath to the ASCII spectrum

  • experiment (int) – Integer identifier for the experiment

  • composition (list of str, optional) – List of atomic symbols, representing the atomic composition of the experiment

  • delimiter (str, optional) – Delimiter character used in the ASCII file. For example, "\t", "\s", ","

  • velocity (float, optional) – Radial velocity to offset the frequency in km/s.

  • temperature (float, optional) – Rotational temperature in Kelvin used for the experiment.

  • col_names (None or list of str, optional) – Names to rename the columns. If None, this is ignored.

  • freq_col (str, optional) – Name of the column to be used for the frequency axis

  • int_col (str, optional) – Name of the column to be used for the intensity axis

  • skiprows (int, optional) – Number of rows to skip reading.

  • verbose (bool, optional) – If True, the logging module will also print statements and display any interaction that happens.

  • kwargs – Additional kwargs are passed onto initializing the Session class

Returns

Return type

AssignmentSession

classmethod load_session(filepath: str)[source]

Load an AssignmentSession from disk, once it has been saved with the save_session method which creates a pickle file.

Parameters

filepath (str) – Path to the AssignmentSession pickle file; typically sessions/{experiment_id}.pkl

Returns

Instance of the AssignmentSession loaded from disk

Return type

AssignmentSession

match_artifacts(artifact_exp: pyspectools.spectra.assignment.AssignmentSession, threshold=0.05)[source]

TODO: Need to update this method; process_artifacts is no longer a method.

Remove artifacts based on another experiment which has the blank sample - i.e. only artifacts.

The routine will simply match peaks found in the artifact experiment, and assign all coincidences in the present experiment as artifacts.

Parameters
  • artifact_exp (AssignmentSession) – Experiment with no sample present

  • threshold (float, optional) – Threshold in absolute frequency units for matching

overlay_molecule(species: str, freq_range=None, threshold=-7.0)[source]

Function to query splatalogue for a specific molecule. By default, the frequency range that will be requested corresponds to the spectral range available in the experiment.

Parameters

species (str) – Identifier for a specific molecule, typically name

Returns

  • FigureWidget – Plotly FigureWidget that shows the experimental spectrum along with the detected peaks, and the molecule spectrum.

  • DataFrame – Pandas DataFrame from the Splatalogue query.

Raises

Exception – If no species are found in the query, raises Exception.

plot_assigned()[source]

Generates a Plotly figure with the assignments overlaid on the experimental spectrum.

Does not require any parameters, but requires that the assignments and peak finding functions have been run previously.

plot_breakdown()[source]

Generate two charts to summarize the breakdown of spectral features. The left column plot shows the number of ulines being assigned by the various sources of frequency data.

  • Artifacts - instrumental interference, from the function process_artifacts

  • Splatalogue - uses the astroquery API, from the function splat_assign_spectrum

  • Published - local catalogs, but with the public kwarg flagged as True

  • Unpublished - local catalogs, but with the public kwarg flagged as False

Returns

Plotly FigureWidget object

plot_spectrum(simulate=False)[source]

Generates a Plotly figure of the spectrum. If U-lines are present, it will plot the simulated spectrum also.

process_clock_spurs(**kwargs)[source]

Method that will generate a LineList corresponding to possible harmonics, sum, and difference frequencies based on a given clock frequency (default: 65,000 MHz).

It is advised to run this function at the end of assignments, owing to the sheer number of possible combinations of lines, which may interfere with real molecular features.

Parameters

kwargs – Optional kwargs are passed into the creation of the LineList with LineList.from_clock.

process_db(auto=True, dbpath=None)[source]

Function for assigning peaks based on a local database file. The database is controlled with the SpectralCatalog class, which will handle all of the searching.

Parameters
  • auto (bool, optional) – If True, the assignments are made automatically.

  • dbpath (str or None, optional) – Filepath to the local database. If none is supplied, uses the default value from the user’s home directory.

process_linelist(name=None, formula=None, filepath=None, linelist=None, auto=True, thres=-10.0, progressbar=True, tol=None, **kwargs)[source]

General purpose function for performing line assignments using local catalog and line data. The two main ways of running this function are to provide either a linelist or a filepath argument. The type of linelist will be checked to determine how the catalog data will be processed: if it’s a string, it will be used to look up an existing LineList object in the experiment line_list attribute; if it’s a LineList object, it will be used directly.

Parameters
  • name (str, optional) – Name of the molecule being assigned. This should be specified when providing a new line list, which then gets added to the experiment.

  • formula (str, optional) – Chemical formula for the molecule being assigned. Should be added in conjunction with name.

  • filepath (str, optional) – If a linelist is not given, a filepath can be specified corresponding to a .cat or .lin file, which will be used to create a LineList object.

  • linelist (str or LineList, optional) – Can be the name of a molecule or LineList object; the former is specified as a string which looks up the experiment line_list attribute for an existing LineList object. If a LineList object is provided, the function will use this directly.

  • auto (bool, optional) – Specifies whether the assignment procedure works without intervention. If False, the user will be prompted to provide a candidate index.

  • thres (float, optional) – log Intensity cut off used to screen candidates.

  • progressbar (bool, optional) – If True, a tqdm progressbar will indicate assignment progress.

  • tol (float, optional) – Tolerance for making assignments. If None, the function will default to the session-wide values of freq_abs and freq_prox to determine the tolerance.

  • kwargs – Kwargs are passed to the Transition object update when assignments are made.

process_linelist_batch(param_dict=None, yml_path=None, **kwargs)[source]

Function for processing a whole folder of catalog files. This takes a user-specified mapping scheme that will associate catalog files with molecule names, formulas, and any other LineList/Transition attributes. This can be in the form of a dictionary or a YAML file; one has to be provided.

An example scheme is given here:

    {
        "cyclopentadiene": {
            "formula": "c5h6",
            "filepath": "../data/catalogs/cyclopentadiene.cat"
        }
    }

The top dictionary has keys corresponding to the name of the molecule, and the value as a sub dictionary containing the formula and filepath to the catalog file as minimum input.

You can also provide additional details that are Transition attributes:

    {
        "benzene": {
            "formula": "c6h6",
            "filepath": "../data/catalogs/benzene.cat",
            "smiles": "c1ccccc1",
            "public": False
        }
    }

Parameters
  • param_dict (dict or None, optional) – If not None, a dictionary containing the mapping scheme will be used to process the catalogs. Defaults to None.

  • yml_path (str or None, optional) – If not None, corresponds to a str filepath to the YAML file to be read.

  • kwargs – Additional keyword arguments will be passed into the assignment process, which are the args for process_linelist.

Raises

ValueError – If yml_path and param_dict args are the same value.
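As a rough illustration of how such a mapping scheme could be unpacked, the sketch below checks each entry for the minimum keys and separates the remaining keys as extra attributes. The helper split_scheme and the constant REQUIRED_KEYS are hypothetical and are not part of pyspectools; how process_linelist_batch actually handles the entries is not shown in this documentation.

```python
REQUIRED_KEYS = {"formula", "filepath"}  # minimum input named in the text


def split_scheme(param_dict):
    """Yield (name, formula, filepath, extras) for every molecule in the scheme."""
    for name, entry in param_dict.items():
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            raise ValueError(f"{name!r} is missing keys: {sorted(missing)}")
        # Anything beyond the required keys is treated as an extra attribute.
        extras = {k: v for k, v in entry.items() if k not in REQUIRED_KEYS}
        yield name, entry["formula"], entry["filepath"], extras


scheme = {
    "cyclopentadiene": {
        "formula": "c5h6",
        "filepath": "../data/catalogs/cyclopentadiene.cat",
    },
    "benzene": {
        "formula": "c6h6",
        "filepath": "../data/catalogs/benzene.cat",
        "smiles": "c1ccccc1",
        "public": False,
    },
}
entries = list(split_scheme(scheme))
```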

process_splatalogue(auto=True, progressbar=True)[source]

Function that will provide an “interface” for interactive line assignment in a notebook environment.

Basic functionality is looping over a series of peaks, querying Splatalogue for known transitions in the vicinity of each. If the line is known in Splatalogue, it is stored in a Transition object and flagged as known. Conversely, if it is not known in Splatalogue, assignment is deferred: the line is flagged as unassigned and added to the uline attribute.

Parameters

auto (bool) – If True the assignment process does not require user input, otherwise will prompt user.

rename_umolecule(name: str, new_name: str, formula='')[source]

Function to update the name of a LineList. This function should be used to update a LineList, particularly when the identity of an unidentified molecule is discovered.

Parameters
  • name (str) – Old name of the LineList.

  • new_name (str) – New name of the LineList - preferably, a real molecule name.

  • formula (str, optional) – New formula of the LineList.

save_session(filepath=None)[source]

Method to save an AssignmentSession to disk.

The underlying mechanics are based on the joblib library, and so there can be cross-compatibility issues particularly when loading from different versions of Python.

Parameters

filepath (str) – Path to save the file to. By default it will go into the sessions folder.

search_frequency(frequency: float)[source]

Function for searching the experiment for a particular frequency. The search range is defined by the Session attribute freq_prox, and will first look for the frequency in the assigned features if any have been made. The routine will then look for it in the U-lines.

Parameters

frequency (float) – Center frequency in MHz

Returns

Pandas dataframe with the matches

Return type

dataframe

search_species(formula=None, name=None, smiles=None)[source]

Method for finding species in the assigned dataframe, with the intention of showing where the observed frequencies are.

Parameters
  • formula (str, optional) – Chemical formula to look up

  • name (str, optional) – Common name to look up

  • smiles (str, optional) – Unique SMILES string to look up

Returns

Pandas dataframe slice with the corresponding lookup

Return type

dataframe

set_velocity(value: float)[source]

Set the radial velocity offset for the spectrum. The velocity is specified in km/s, and the convention is that a positive velocity yields a redshifted spectrum (i.e. a source moving away from us).

This method should be used to change the velocity, as it will automatically re-calculate the dataframe frequency column to the new velocity.

Parameters

value (float) – Velocity in km/s
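To make the sign convention concrete, here is a minimal sketch of a radial velocity shift. It assumes the non-relativistic radio convention, f_obs = f_rest (1 - v/c); the function name shift_frequencies is hypothetical, and whether set_velocity applies this exact expression (or its inverse, to recover rest frequencies) is not stated in this documentation.

```python
C_KMS = 299_792.458  # speed of light in km/s


def shift_frequencies(frequencies, velocity):
    """Shift rest frequencies (MHz) by a radial velocity in km/s (radio convention)."""
    return [f * (1.0 - velocity / C_KMS) for f in frequencies]


rest = [10000.0, 20000.0]
observed = shift_frequencies(rest, 26.2)  # positive velocity lowers the frequency
```

With a positive velocity every frequency moves down, which is what a redshift means on a frequency axis.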

simulate_spectrum(x: numpy.ndarray, centers: List[float], widths: List[float], amplitudes: List[float], fake=False)[source]

Generate a synthetic spectrum with Gaussians with the specified parameters, on a given x axis.

GaussianModel is used here to remain internally consistent with the rest of the code.

Parameters
  • x (numpy.ndarray) – Array of x values to evaluate Gaussians on

  • centers (list of float) – Array of Gaussian centers

  • widths (list of float) – Array of Gaussian widths

  • amplitudes (list of float) – Array of Gaussian amplitudes

  • fake (bool, optional) – Indicates whether false intensities are used for the simulation

Returns

y – array of y values
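A synthetic spectrum of this kind is a sum of Gaussians evaluated on a shared axis. The sketch below assumes that GaussianModel refers to the lmfit model, whose amplitude is the integrated area and whose width is the standard deviation; the function name simulate_gaussians is hypothetical and the fake option is not reproduced.

```python
import math


def simulate_gaussians(x, centers, widths, amplitudes):
    """Sum area-normalized Gaussians on the axis x."""
    y = [0.0] * len(x)
    for center, width, amplitude in zip(centers, widths, amplitudes):
        # Area-normalized form: amplitude / (sigma * sqrt(2 pi)) at the center.
        norm = amplitude / (width * math.sqrt(2.0 * math.pi))
        for i, xi in enumerate(x):
            y[i] += norm * math.exp(-((xi - center) ** 2) / (2.0 * width ** 2))
    return y


x = [10000.0 + 0.01 * i for i in range(201)]  # 10000 to 10002 MHz
y = simulate_gaussians(
    x, centers=[10000.5, 10001.5], widths=[0.05, 0.05], amplitudes=[1.0, 0.5]
)
```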

simulate_sticks(catalogpath: str, N: float, Q: float, T: float, doppler=None, gaussian=False)[source]

Simulates a stick spectrum with intensities in flux units (Jy) for a given catalog file, the column density, and the rotational partition function at temperature T.

Parameters
  • catalogpath (str) – path to SPCAT catalog file

  • N (float) – column density in cm^-2

  • Q (float) – partition function at temperature T

  • T (float) – temperature in Kelvin

  • doppler (float, optional) – doppler width in km/s; defaults to session wide value

  • gaussian (bool, optional) – if True, simulates Gaussian profiles instead of sticks

Returns

if gaussian is False, returns a dataframe with sticks; if True, returns a simulated Gaussian line profile spectrum

splat_assign_spectrum(auto=False)[source]

Alias for process_splatalogue. Function will be removed in a later version.

Parameters

auto (bool) – Specifies whether the assignment procedure is automatic.

stacked_plot(frequencies: List[float], int_col=None, freq_range=0.05)[source]

Special implementation of the stacked_plot from the figurefactory module, adapted for AssignmentSession. In this version, the assigned/u-lines are also indicated.

This function will generate a Plotly figure that stacks up the spectra as subplots, with increasing frequencies going up the plot. This function was written primarily to identify harmonically related lines, which in the absence of centrifugal distortion should line up perfectly in the center of the plot.

Due to limitations with Plotly, there is a maximum of ~8 plots that can be stacked; an Exception is raised if more than 8 frequencies are provided.

Parameters
  • frequencies (list of floats) – Center frequencies

  • freq_range (float) – Percentage value of each center frequency to use as cutoffs

Returns

Plotly Figure object

umol_gen(silly=True)[source]

Generator for unidentified molecule names.

Yields

str – Formatted as “UMol_XXX”

update_database(dbpath=None)[source]

Adds all of the entries to a specified SpectralCatalog database. The database defaults to the global database stored in the home directory. This method will remove everything in the database associated with this experiment’s ID, and re-add the entries.

Parameters

dbpath (str, optional) – path to a SpectralCatalog database. Defaults to the system-wide catalog.

class pyspectools.spectra.assignment.LineList(name: str = '', formula: str = '', smi: str = '', filecontents: str = '', filepath: str = '', transitions: List = <factory>, frequencies: List[float] = <factory>, catalog_frequencies: List[float] = <factory>, source: str = '')[source]

Bases: object

Class for handling and homogenizing all of the possible line lists: from peaks to assignments to catalog files.

name

Name of the line list. Can be used to identify the molecule, or to simply state the purpose of the list.

Type

str, optional

formula

Chemical formula for the molecule, if applicable.

Type

str, optional

smi

SMILES representation of the molecule, if applicable.

Type

str, optional

filecontents

String representation of the file contents used to make the line list.

Type

str, optional

filepath

Path to the file used to make the list.

Type

str, optional

transitions

A designated list for holding Transition objects. This is the bulk of the information for a given line list.

Type

list, optional

add_uline(frequency: float, intensity: float, **kwargs)[source]

Function to manually add a U-line to the LineList. The function creates a Transition object with the frequency and intensity values provided by a user, which is then compared with the other transition entries within the LineList. If it doesn’t already exist, it will then add the new Transition to the LineList.

Kwargs are passed to the creation of the Transition object.

Parameters

frequency, intensity (float) – Floats corresponding to the frequency and intensity of the line in a given unit.

add_ulines(data: List[Tuple[float, float]], **kwargs)[source]

Function to add multiple pairs of frequency/intensity to the current LineList.

Kwargs are passed to the creation of the Transition object.

Parameters

data (iterable of 2-tuple) –

List-like of 2-tuples corresponding to frequency and intensity. Data should look like this example:

    [(12345.213, 5.), (18623.125, 12.3)]

catalog_frequencies: List[float]
filecontents: str = ''
filepath: str = ''
find_candidates(frequency: float, lstate_threshold=4.0, freq_tol=0.1, int_tol=-10.0, max_uncertainty=0.2)[source]

Function for searching the LineList for candidates. The first step uses pure Python to isolate transitions that meet three criteria: the lower state energy, the catalog intensity, and the frequency distance.

If no candidates are found, the function will return None. Otherwise, it will return the list of transitions and a list of associated normalized weights.

Parameters
  • frequency (float) – Frequency in MHz to try and match.

  • lstate_threshold (float, optional) – Lower state energy threshold in Kelvin

  • freq_tol (float, optional) – Frequency tolerance in MHz for matching two frequencies

  • int_tol (float, optional) – log Intensity threshold

Returns

If candidates are found, lists of the transitions and the associated weights are returned. Otherwise, returns None

Return type

transitions, weighting or None
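The three-criteria screen can be sketched with a small stand-in class. Everything here is illustrative: CatalogLine is a hypothetical substitute for Transition, and the inverse-distance weighting is an assumption made only to show what “normalized weights” could look like; the weighting actually used by find_candidates is not described in this documentation.

```python
from dataclasses import dataclass


@dataclass
class CatalogLine:  # hypothetical stand-in for a Transition
    catalog_frequency: float  # MHz
    catalog_intensity: float  # log intensity
    lstate_energy: float      # Kelvin


def find_candidates_sketch(lines, frequency, lstate_threshold=4.0,
                           freq_tol=0.1, int_tol=-10.0):
    """Return (candidates, weights), or None when nothing passes the screen."""
    candidates = [
        line for line in lines
        if line.lstate_energy <= lstate_threshold
        and line.catalog_intensity >= int_tol
        and abs(line.catalog_frequency - frequency) <= freq_tol
    ]
    if not candidates:
        return None
    # Assumed weighting: closer in frequency means a larger weight.
    raw = [1.0 / (abs(c.catalog_frequency - frequency) + 1e-6) for c in candidates]
    total = sum(raw)
    return candidates, [w / total for w in raw]


lines = [
    CatalogLine(12345.20, -4.0, 2.0),
    CatalogLine(12345.26, -5.0, 3.0),
    CatalogLine(12345.21, -4.0, 50.0),  # rejected: lower state energy too high
    CatalogLine(12400.00, -3.0, 1.0),   # rejected: too far in frequency
]
result = find_candidates_sketch(lines, 12345.21)
```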

find_nearest(frequency: float, tol=0.001)[source]

Look up transitions to find the nearest in frequency to the query. If the matched frequency is within a tolerance, then the function will return the corresponding Transition. Otherwise, it returns None.

Parameters
  • frequency (float) – Frequency in MHz to search for.

  • tol (float, optional) – Maximum tolerance for the deviation from the LineList frequency and query frequency

Returns

Return type

Transition object or None
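The lookup logic amounts to a nearest-neighbour search followed by a tolerance check. A minimal sketch on bare frequencies, assuming tol is an absolute deviation in MHz (the function name find_nearest_sketch is hypothetical, and the real method returns a Transition rather than a float):

```python
def find_nearest_sketch(frequencies, frequency, tol=1e-3):
    """Return the closest catalog frequency if it lies within tol, else None."""
    if not frequencies:
        return None
    nearest = min(frequencies, key=lambda f: abs(f - frequency))
    return nearest if abs(nearest - frequency) <= tol else None


catalog = [8000.0, 8000.5, 9000.0]
hit = find_nearest_sketch(catalog, 8000.5004)
miss = find_nearest_sketch(catalog, 8000.6)
```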

formula: str = ''
frequencies: List[float]
classmethod from_artifacts(frequencies: List[float], **kwargs)[source]

Specialized class method for creating a LineList object specifically for artifacts/RFI. These Transitions are specially flagged as Artifacts.

Parameters
  • frequencies (iterable of floats) – List or array of floats corresponding to known artifact frequencies.

  • kwargs – Kwargs are passed into the Transition object creation.

Returns

Return type

LineList

classmethod from_catalog(name: str, formula: str, filepath: str, min_freq=0.0, max_freq=1000000000000.0, max_lstate=9000.0, **kwargs)[source]

Create a Line List object from an SPCAT catalog.

Parameters
  • name (str) – Name of the molecule the catalog belongs to

  • formula (str) – Chemical formula of the molecule

  • filepath (str) – Path to the catalog file.

  • min_freq (float, optional) – Minimum frequency in MHz for the frequency cutoff

  • max_freq (float, optional) – Maximum frequency in MHz for the frequency cutoff

  • max_lstate (float, optional) – Maximum lower state energy to filter out absurd lines

  • kwargs (optional) – Additional attributes that are passed into the Transition objects.

Returns

Instance of LineList with the digested catalog.

Return type

linelist_obj

classmethod from_clock(max_multi=64, clock=65000.0, **kwargs)[source]

Method of generating a LineList object by calculating all possible combinations of the

Parameters
  • max_multi (int, optional) – [description], by default 64

  • clock (float, optional) – Clock frequency to calculate sub-harmonics of, in units of MHz. Defaults to 65,000 MHz, which corresponds to the Keysight AWG

Returns

LineList object with the full list of possible clock spurs, as harmonics, sum, and difference frequencies.

Return type

LineList object
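One possible reading of the clock spur calculation is sketched below: take the sub-harmonics clock/n for n up to max_multi, then add every pairwise sum and difference. This interpretation is an assumption based on the parameter descriptions above, not a confirmed description of from_clock, and the function name clock_spurs is hypothetical.

```python
from itertools import combinations


def clock_spurs(max_multi=64, clock=65000.0):
    """Sub-harmonics of the clock plus their pairwise sums and differences (MHz)."""
    subharmonics = [clock / n for n in range(1, max_multi + 1)]
    spurs = set(subharmonics)
    for a, b in combinations(subharmonics, 2):
        spurs.add(a + b)
        spurs.add(abs(a - b))
    return sorted(spurs)


small = clock_spurs(max_multi=2, clock=65000.0)
```

The number of combinations grows quadratically with max_multi, which is consistent with the advice to run process_clock_spurs at the end of the assignment process.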

classmethod from_dataframe(dataframe: pandas.core.frame.DataFrame, name='Peaks', freq_col='Frequency', int_col='Intensity', **kwargs)[source]

Specialized class method for creating a LineList object from a Pandas Dataframe. This method is called by the AssignmentSession.df2ulines function to generate a Peaks LineList during peak detection.

Parameters
  • dataframe (pandas DataFrame) – DataFrame containing frequency and intensity information

  • freq_col (str, optional) – Name of the frequency column

  • int_col (str, optional) – Name of the intensity column

  • kwargs – Optional settings are passed into the creation of Transition objects.

Returns

Return type

LineList

classmethod from_lin(name: str, filepath: str, formula='', **kwargs)[source]

Generate a LineList object from a .lin file. This method should be used for intermediate assignments, when one does not know what the identity of a molecule is but has measured some frequency data.

Parameters
  • name (str) – Name of the molecule

  • filepath (str) – File path to the .lin file.

  • formula (str, optional) – Chemical formula of the molecule if known.

  • kwargs – Additional kwargs are passed into the Transition objects.

Returns

Return type

LineList

classmethod from_list(name: str, frequencies: List[float], formula='', **kwargs)[source]

Generic, low level method for creating a LineList object from a list of frequencies. This method can be used when neither lin, catalog, nor splatalogue is appropriate and you would like to manually create it by handpicked frequencies.

Parameters
  • name (str) – Name of the species - doesn’t have to be its real name, just an identifier.

  • frequencies (list) – A list of floats corresponding to the “catalog” frequencies.

  • formula (str, optional) – Formula of the species, if known.

  • kwargs – Optional settings are passed into the creation of Transition objects.

Returns

Return type

LineList

classmethod from_pgopher(name: str, filepath: str, formula='', **kwargs)[source]

Method to take the output of a PGopher file and create a LineList object. The PGopher output must be in the comma delimited specification.

This is actually the ideal way to generate LineList objects: it fills in all of the relevant fields, such as linestrength and state energies.

Parameters
  • name (str) – Name of the molecule

  • filepath (str) – Path to the PGopher CSV output

  • formula (str, optional) – Chemical formula of the molecule, defaults to an empty string.

Returns

Return type

LineList

classmethod from_splatalogue_query(dataframe: pandas.core.frame.DataFrame, **kwargs)[source]

Method for converting a Splatalogue query dataframe into a LineList object. This is designed with the intention of pre-querying a set of molecules ahead of time, so that the user can have direct control over which molecules are specifically targeted without having to generate specific catalog files.

Parameters

dataframe (pandas DataFrame) – DataFrame generated by the function analysis.search_molecule

Returns

Return type

LineList

get_assignments()[source]

Function for retrieving assigned lines in a Line List.

Returns

assign_objs – List of all of the transition objects where the uline flag is set to False.

Return type

list

get_frequencies(numpy=False)[source]

Method to extract all the frequencies out of a LineList

Parameters

numpy (bool, optional) – If True, returns a NumPy ndarray with the frequencies.

Returns

List of transition frequencies

Return type

List or np.ndarray

get_multiple()[source]

Convenience function to extract all the transitions within a LineList that have multiple possible assignments.

Returns

List of Transition objects that have multiple assignments remaining.

Return type

List

get_ulines()[source]

Function for retrieving unidentified lines in a Line List.

Returns

uline_objs – List of all of the transition objects where the uline flag is set to True.

Return type

list

name: str = ''
smi: str = ''
source: str = ''
to_dataframe()[source]

Convert the transition data into a Pandas DataFrame.

Returns

Pandas Dataframe with all of the transitions in the line list.

Return type

dataframe

to_ftb(filepath=None, thres=-10.0, shots=500, dipole=1.0, **kwargs)[source]

Function to create an FTB file from a LineList object. This will create entries for every transition entry above a certain intensity threshold, in whatever units the intensities are in; i.e. SPCAT will be in log units, while experimental peaks will be in whatever arbitrary voltage scale.

Parameters
  • filepath (None or str, optional) – Path to write the ftb file to. If None (default), uses the name of the LineList and writes to the ftb folder.

  • thres (float, optional) – Threshold to cut off transitions in the ftb file. Transitions with less intensity than this value will not be included. Units are the same as whatever the LineList units are.

  • shots (int, optional) – Number of shots to integrate.

  • dipole (float, optional) – Target dipole moment for the species

  • kwargs – Additional kwargs are passed into the ftb creation, e.g. magnet, discharge, etc.

to_pickle(filepath=None)[source]

Function to serialize the LineList to a Pickle file. If no filepath is provided, the function will default to using the name attribute of the LineList to name the file.

Parameters

filepath (str or None, optional) – If None, uses name attribute for the filename, and saves to the linelists folder.

transitions: List
update_linelist(transition_objs: List[pyspectools.spectra.assignment.Transition])[source]

Adds transitions to a LineList if they do not exist in the list already.

Parameters

transition_objs (list) – List of Transition objects

update_transition(index: int, **kwargs)[source]

Function for updating a specific Transition object within the Line List.

Parameters
  • index (int) – Index for the list Transition object

  • kwargs (optional) – Updates to the Transition object

class pyspectools.spectra.assignment.Molecule(name: str = '', formula: str = '', smi: str = '', filecontents: str = '', filepath: str = '', transitions: List = <factory>, frequencies: List[float] = <factory>, catalog_frequencies: List[float] = <factory>, source: str = '', A: float = 20000.0, B: float = 6000.0, C: float = 3500.0, var_file: str = '')[source]

Bases: pyspectools.spectra.assignment.LineList

Special instance of the LineList class. The idea is to eventually use the high speed fitting/cataloguing routines by Brandon to provide quick simulations overlaid on chirp spectra.

Attributes

A: float = 20000.0
B: float = 6000.0
C: float = 3500.0
var_file: str = ''
class pyspectools.spectra.assignment.Session(experiment: int, composition: List[str] = <factory>, temperature: float = 4.0, doppler: float = 0.01, velocity: float = 0.0, freq_prox: float = 0.1, freq_abs: bool = True, baseline: float = 0.0, noise_rms: float = 0.0, noise_region: List[float] = <factory>, max_uncertainty: float = 0.2)[source]

Bases: object

Data class for handling parameters used for an AssignmentSession. The user generally shouldn’t need to directly interact with this class, but it can give some level of dynamic control and bookkeeping over how and what molecules can be assigned, particularly with the composition, the frequency thresholds for matching, and the noise statistics.

experiment

ID for experiment

Type

int

composition

List of atomic symbols. Used for filtering out species in the Splatalogue assignment procedure.

Type

list of str

temperature

Temperature in K. Used for filtering transitions in the automated assignment, with a cutoff of 3 times this value.

Type

float

doppler

Doppler width in km/s; the default value is about 0.5 kHz at 15 GHz. Used for simulating lineshapes and for lineshape analysis.

Type

float
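A Doppler width in velocity units can be converted to a frequency width with the non-relativistic relation Δf = f · v / c. The short sketch below (the function name doppler_width_hz is hypothetical) works this out for the default doppler value at 15 GHz.

```python
C_KMS = 299_792.458  # speed of light in km/s


def doppler_width_hz(frequency_hz, width_kms):
    """Frequency width in Hz for a velocity width in km/s at a given frequency."""
    return frequency_hz * width_kms / C_KMS


width = doppler_width_hz(15e9, 0.01)  # default doppler of 0.01 km/s at 15 GHz
```

For these inputs the width comes out at roughly 500 Hz.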

velocity

Radial velocity of the source in km/s; used to offset the frequency spectrum

Type

float

freq_prox

Frequency cutoff for line assignments. If the freq_abs attribute is True, this value is taken as an absolute frequency difference. Otherwise, it is a decimal percentage of the frequency being compared.

Type

float

freq_abs

If True, the freq_prox attribute is taken as an absolute frequency value, otherwise as a decimal percentage of the frequency being compared.

Type

bool
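The interplay between freq_prox and freq_abs can be summarized in a few lines. This is a sketch of the described behavior, assuming “decimal percentage” means a fraction (e.g. 1e-4 for 0.01%); the helper names assignment_tolerance and within_tolerance are hypothetical.

```python
def assignment_tolerance(frequency, freq_prox=0.1, freq_abs=True):
    """Tolerance in MHz: absolute value, or a fraction of the frequency."""
    return freq_prox if freq_abs else freq_prox * frequency


def within_tolerance(observed, catalog, freq_prox=0.1, freq_abs=True):
    """True if the observed line lies within the tolerance of the catalog line."""
    return abs(observed - catalog) <= assignment_tolerance(catalog, freq_prox, freq_abs)


close = within_tolerance(10000.05, 10000.0)                 # 0.05 MHz offset, 0.1 MHz window
far = within_tolerance(10000.5, 10000.0)                    # 0.5 MHz offset, 0.1 MHz window
relative = within_tolerance(10000.5, 10000.0, 1e-4, False)  # window of 1 MHz at 10 GHz
```

A relative window widens with frequency, which can suit spectra where the line width scales with frequency.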

baseline

Baseline level of signal used for intensity calculations and peak detection

Type

float

noise_rms

RMS of the noise used for intensity calculations and peak detection

Type

float

noise_region

The frequency region used to define the noise floor.

Type

2-tuple of floats

max_uncertainty

Value to use as the maximum uncertainty for considering a transition for assignments.

Type

float

baseline: float = 0.0
composition: List[str]
doppler: float = 0.01
experiment: int
freq_abs: bool = True
freq_prox: float = 0.1
max_uncertainty: float = 0.2
noise_region: List[float]
noise_rms: float = 0.0
temperature: float = 4.0
velocity: float = 0.0
class pyspectools.spectra.assignment.Transition(name: str = '', smiles: str = '', formula: str = '', frequency: float = 0.0, catalog_frequency: float = 0.0, catalog_intensity: float = 0.0, deviation: float = 0.0, intensity: float = 0.0, uncertainty: float = 0.0, S: float = 0.0, peak_id: int = 0, experiment: int = 0, uline: bool = True, composition: List[str] = <factory>, v_qnos: List[int] = <factory>, r_qnos: str = '', fit: Dict = <factory>, ustate_energy: float = 0.0, lstate_energy: float = 0.0, interference: bool = False, weighting: float = 0.0, source: str = 'Catalog', public: bool = True, velocity: float = 0.0, discharge: bool = False, magnet: bool = False, multiple: List[str] = <factory>, final: bool = False)[source]

Bases: object

DataClass for handling assignments. The attributes are chosen so that a line assignment is unambiguous and can be reproduced later, in a form that is both machine and human readable.

name

IUPAC/common name; the former is preferred to be unambiguous

Type

str

formula

Chemical formula, or usually the stoichiometry

Type

str

smiles

SMILES code that provides a machine and human readable chemical specification

Type

str

frequency

Observed frequency in MHz

Type

float

intensity

Observed intensity, in whatever units the experiment uses, e.g. Jy/beam or microvolts.

Type

float

catalog_frequency

Catalog frequency in MHz

Type

float

catalog_intensity

Catalog line intensity, typically in SPCAT units

Type

float

S

Theoretical line strength; differs from the catalog line strength as it may be used for the intrinsic line strength Sμ²

Type

float

peak_id

Peak id from specific experiment

Type

int

uline

Flag to indicate whether the line is unidentified (a U-line); True means no assignment has been made

Type

bool

composition

A list of atomic symbols specifying what the experimental elemental composition is. Influences which molecules are considered possible in the Splatalogue assignment procedure.

Type

list of str

v_qnos

Quantum numbers for vibrational modes. Index corresponds to mode, and int value to number of quanta. Length should be equal to 3N-6, where N is the number of atoms in a nonlinear molecule.

Type

list of int

r_qnos

Rotational quantum numbers. TODO - better way of managing rotational quantum numbers

Type

str

experiment

Experiment ID to use as a prefix/suffix for record keeping

Type

int

weighting

Value for weighting factor used in the automated assignment

Type

float

fit

Contains the fitted parameters and model

Type

dict

ustate_energy

Energy of the upper state in Kelvin

Type

float

lstate_energy

Energy of the lower state in Kelvin

Type

float

interference

Flag to indicate if this assignment is not molecular in nature

Type

bool

source

Indicates what the source used for this assignment is

Type

str

public

Flag to indicate if the information for this assignment is public/published

Type

bool

velocity

Velocity of the source used to make the assignment in km/s

Type

float

discharge

Whether or not the line is discharge dependent

Type

bool

magnet

Whether or not the line is magnet dependent (i.e. open shell)

Type

bool

S: float = 0.0
calc_intensity(Q: float, T=300.0)[source]

Convert linestrength into intensity.

Parameters
  • Q (float) – Partition function for the molecule at temperature T

  • T (float) – Temperature to calculate the intensity at, in Kelvin

Returns

log10 of the intensity in SPCAT format

Return type

float

calc_linestrength(Q: float, T=300.0)[source]

Convert intensity into linestrength.

Parameters
  • Q (float) – Partition function for the molecule at temperature T

  • T (float) – Temperature to calculate the intensity at, in Kelvin

Returns

Intrinsic linestrength of the transition

Return type

float
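
For orientation, the two conversions can be sketched with the intensity relation from the JPL/SPCAT catalog documentation, I(T) = 4.16231e-5 ν Sμ² [exp(-E_l/kT) - exp(-E_u/kT)] / Q, with ν in MHz and the intensity in nm² MHz. That pyspectools follows exactly this convention is an assumption. The function names are hypothetical, and the state energies are taken in Kelvin as in the ustate_energy and lstate_energy attributes.

```python
import math

# (8 pi^3 / 3hc) for frequency in MHz and S mu^2 in Debye^2 (JPL catalog convention)
PREFACTOR = 4.16231e-5


def boltzmann_term(lstate_energy: float, ustate_energy: float, T: float) -> float:
    """Population difference factor; energies and T are in Kelvin."""
    return math.exp(-lstate_energy / T) - math.exp(-ustate_energy / T)


def linestrength_to_log_intensity(S, frequency, lstate_energy, ustate_energy,
                                  Q, T=300.0):
    """Return log10 of the catalog intensity for linestrength S."""
    factor = boltzmann_term(lstate_energy, ustate_energy, T)
    return math.log10(PREFACTOR * frequency * S * factor / Q)


def log_intensity_to_linestrength(log_intensity, frequency, lstate_energy,
                                  ustate_energy, Q, T=300.0):
    """Invert the relation above to recover the linestrength."""
    factor = boltzmann_term(lstate_energy, ustate_energy, T)
    return 10.0 ** log_intensity * Q / (PREFACTOR * frequency * factor)


# Round trip with illustrative values: a 10 GHz line, Q = 1000 at 300 K
S_in = 2.5
log_I = linestrength_to_log_intensity(S_in, 10_000.0, 1.0, 1.48, Q=1000.0)
S_out = log_intensity_to_linestrength(log_I, 10_000.0, 1.0, 1.48, Q=1000.0)
```

Because both directions share the same Boltzmann factor and partition function, the round trip only recovers the original linestrength when Q and T are the same in both calls.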

catalog_frequency: float = 0.0
catalog_intensity: float = 0.0
check_molecule(other)[source]

Check equivalency based on a common carrier. Compares the name, formula, and smiles of this Transition object with another.

Returns

True if the two Transitions belong to the same carrier.

Return type

bool

choose_assignment(index: int)[source]

Function to manually pick an assignment from a list of multiple possible assignments found during process_linelist. After the new assignment is copied over, the final attribute is set to True, and the Transition will no longer raise a warning during finalize_assignments.

Parameters

index (int) – Index of the candidate to use for the assignment.

composition: List[str]
deviation: float = 0.0
discharge: bool = False
experiment: int = 0
final: bool = False
fit: Dict
formula: str = ''
frequency: float = 0.0
classmethod from_dict(data_dict: Dict)[source]

Method for generating a Transition object from a dictionary. All this method does is unpack a dictionary into the __init__ method.

Parameters

data_dict (dict) – Dictionary containing all of the Transition DataClass fields that are to be populated.

Returns

Transition object converted from the input dictionary

Return type

Transition
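
Unpacking a dictionary into a dataclass constructor can be sketched as below. MiniTransition is a hypothetical stand-in carrying only a few of the Transition fields, so that the example runs without pyspectools installed. The frequency is an illustrative value.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MiniTransition:
    """Hypothetical stand-in with a handful of the Transition fields."""
    name: str = ""
    formula: str = ""
    frequency: float = 0.0
    uline: bool = True
    composition: List[str] = field(default_factory=list)

    @classmethod
    def from_dict(cls, data_dict: Dict) -> "MiniTransition":
        # Keys must match field names; missing fields keep their defaults
        return cls(**data_dict)


line = MiniTransition.from_dict(
    {"name": "cyanoacetylene", "formula": "HC3N",
     "frequency": 9098.3, "uline": False}
)
```

With this pattern, a key that is not a field name raises a TypeError, so a misspelled field is caught at construction time.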

classmethod from_json(json_path: str)[source]

Method for initializing a Transition object from a JSON file.

Parameters

json_path (str) – Path to JSON file

Returns

Transition object loaded from a JSON file.

Return type

Transition

classmethod from_yml(yaml_path: str)[source]

Method for initializing a Transition object from a YAML file.

Parameters

yaml_path (str) – Path to YAML file

Returns

Transition object loaded from a YAML file.

Return type

Transition

get_spectrum(x: numpy.ndarray)[source]

Generate a synthetic peak by supplying the x axis for a particular spectrum. This method assumes that some fit parameters have been determined previously.

Parameters

x (Numpy 1D array) – Frequency bins from an experiment to simulate the line features.

Returns

Values of the model function spectrum at each particular value of x

Return type

Numpy 1D array
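
As an illustration of evaluating a fitted lineshape on an experimental frequency axis, the sketch below uses a Gaussian profile. The lineshape and the parameter names (amplitude, center, sigma) are assumptions; the model and parameters actually stored in the fit attribute may differ.

```python
import numpy as np


def gaussian_peak(x: np.ndarray, amplitude: float, center: float,
                  sigma: float) -> np.ndarray:
    """Evaluate a Gaussian lineshape at each frequency bin in x."""
    return amplitude * np.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


# Frequency axis with 0.01 spacing and a line centered at 10000
x = np.linspace(9990.0, 10010.0, 2001)
model = gaussian_peak(x, amplitude=1.0, center=10_000.0, sigma=0.05)
```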

intensity: float = 0.0
interference: bool = False
lstate_energy: float = 0.0
magnet: bool = False
multiple: List[str]
name: str = ''
peak_id: int = 0
public: bool = True
r_qnos: str = ''
reset_assignment()[source]

Function to reset an assigned line to its original state. Only the frequency, intensity, and details about the experiment are kept.

smiles: str = ''
source: str = 'Catalog'
to_file(filepath: str, format='yaml')[source]

Save a Transition object to disk with a specified file format. Defaults to YAML.

Parameters
  • filepath (str) – Path to the file to write

  • format (str, optional) – Syntax used for dumping. Defaults to 'yaml'.
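
A save-and-reload round trip for a dataclass can be sketched with the standard library alone. The example uses JSON rather than the YAML default to avoid a third-party dependency, and MiniTransition is again a hypothetical stand-in rather than the pyspectools class.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class MiniTransition:
    """Hypothetical stand-in with a few of the Transition fields."""
    name: str = ""
    frequency: float = 0.0
    uline: bool = True

    def to_file(self, filepath: str) -> None:
        # Serialize every field to a JSON mapping
        with open(filepath, "w") as handle:
            json.dump(asdict(self), handle, indent=2)

    @classmethod
    def from_json(cls, json_path: str) -> "MiniTransition":
        with open(json_path) as handle:
            return cls(**json.load(handle))


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "transition.json")
    original = MiniTransition(name="example", frequency=12345.678, uline=False)
    original.to_file(path)
    restored = MiniTransition.from_json(path)
```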

uline: bool = True
uncertainty: float = 0.0
ustate_energy: float = 0.0
v_qnos: List[int]
velocity: float = 0.0
weighting: float = 0.0

Module contents