Models
SearchResult
- class malva_client.models.SearchResult(raw_data, client)
Bases: MalvaDataFrame
Container for search results that extends MalvaDataFrame for direct analysis.
Data loading is lazy: if raw_data contains no ‘results’ (e.g. it is the initial POST response or a lightweight status response), the DataFrame is populated on first access to .df by fetching /api/expression-data/<job_id>/results from the server.
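The lazy-loading behaviour described above can be pictured with a small stand-in class. This is a minimal sketch, not the library's implementation: `LazyResults`, its `rows` property, and `fake_fetch` are hypothetical names, and the fetch callable only stands in for the request to the results endpoint.

```python
from typing import Any, Callable, Dict, List, Optional


class LazyResults:
    """Hypothetical stand-in that defers loading rows until first access."""

    def __init__(self, raw_data: Dict[str, Any], fetch: Callable[[str], List[dict]]):
        self._raw = raw_data
        self._fetch = fetch  # stands in for the HTTP call to the results endpoint
        self._rows: Optional[List[dict]] = None

    @property
    def rows(self) -> List[dict]:
        if self._rows is None:
            if "results" in self._raw:
                # Results were already part of the payload; no request is needed.
                self._rows = self._raw["results"]
            else:
                # Lightweight response: fetch once, then reuse the cached rows.
                self._rows = self._fetch(self._raw["job_id"])
        return self._rows


calls: List[str] = []


def fake_fetch(job_id: str) -> List[dict]:
    """Record each call so the single-fetch behaviour can be observed."""
    calls.append(job_id)
    return [{"cell_type": "neuron", "rel": 0.5}]


lazy = LazyResults({"job_id": "job-1"}, fake_fetch)
first = lazy.rows   # triggers the one and only fetch
second = lazy.rows  # served from the cache
```

Caching the rows on the instance means that repeated access to the property does not repeat the request.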
MalvaDataFrame
- class malva_client.models.MalvaDataFrame(expr_df, client, sample_metadata=None)
Bases: object
Wrapper around a pandas DataFrame with sample metadata enrichment and analysis methods.
- property df: DataFrame
Access the underlying pandas DataFrame.
- filter_by(**kwargs)
Filter data by any combination of metadata fields (optimized for large datasets). Uses simplified field names (e.g., 'organ' instead of 'specimen_from_organism.organ').
- Parameters:
**kwargs – Field=value pairs for filtering
- Return type:
MalvaDataFrame
- Returns:
New MalvaDataFrame with filtered data
Example
df.filter_by(organ='brain', disease='normal', cell_type='neuron')
df.filter_by(species='Homo sapiens', study='BrainDepressiveDisorder')
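To make the idea of simplified field names concrete, here is a rough sketch on plain dictionaries. The rule that a simplified name is the last dotted segment of the full field name is an assumption made for illustration, and `simplify` and `filter_records` are hypothetical helpers rather than part of malva_client.

```python
from typing import Any, Dict, List


def simplify(field: str) -> str:
    """Assumed rule: the simplified name is the last dotted segment."""
    return field.rsplit(".", 1)[-1]


def filter_records(records: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Keep records whose simplified fields equal every requested value."""
    kept = []
    for record in records:
        simple = {simplify(key): value for key, value in record.items()}
        if all(simple.get(field) == wanted for field, wanted in criteria.items()):
            kept.append(record)
    return kept


records = [
    {"specimen_from_organism.organ": "brain", "disease": "normal", "rel": 0.8},
    {"specimen_from_organism.organ": "liver", "disease": "normal", "rel": 0.1},
    {"specimen_from_organism.organ": "brain", "disease": "glioma", "rel": 0.4},
]

# All criteria must match, mirroring the Field=value pairs passed as keyword arguments.
brain_normal = filter_records(records, organ="brain", disease="normal")
```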
- aggregate_by(group_by, agg_func='mean', expr_column='rel')
Aggregate expression data by the specified grouping variables.
- Parameters:
- Return type:
DataFrame
- Returns:
DataFrame with aggregated results
Example
df.aggregate_by('cell_type')
df.aggregate_by(['organ', 'cell_type'])
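The aggregation can be sketched without pandas as a group-then-reduce step. The helper `aggregate_mean` below is hypothetical and covers only the default agg_func='mean' over the default 'rel' column; the real method returns a DataFrame, whereas this sketch returns a plain dictionary keyed by group.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List, Sequence, Tuple


def aggregate_mean(
    rows: List[Dict[str, Any]],
    group_by: Sequence[str],
    expr_column: str = "rel",
) -> Dict[Tuple[Any, ...], float]:
    """Average expr_column within each combination of the group_by fields."""
    groups: Dict[Tuple[Any, ...], List[float]] = defaultdict(list)
    for row in rows:
        key = tuple(row[field] for field in group_by)
        groups[key].append(row[expr_column])
    return {key: mean(values) for key, values in groups.items()}


rows = [
    {"organ": "brain", "cell_type": "neuron", "rel": 0.5},
    {"organ": "brain", "cell_type": "neuron", "rel": 1.5},
    {"organ": "brain", "cell_type": "astrocyte", "rel": 2.0},
]

by_cell_type = aggregate_mean(rows, ["cell_type"])
by_organ_and_cell_type = aggregate_mean(rows, ["organ", "cell_type"])
```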
- plot_expression_by(group_by, limit=None, sort_by='mean', ascending=False, **kwargs)
Plot expression levels grouped by a metadata field.
- Parameters:
group_by (str) – Column to group by for plotting
limit (Optional[int]) – Maximum number of categories to show (shows top N)
sort_by (str) – How to sort categories ('mean', 'median', 'count', 'alphabetical')
ascending (bool) – Whether to sort in ascending order (False shows highest first)
**kwargs – Additional arguments passed to matplotlib
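How sort_by, ascending, and limit might interact when choosing which categories to draw can be sketched as follows. `order_categories` is a hypothetical helper, and applying the ascending flag uniformly to every sort mode (including 'alphabetical') is an assumption; the plotting itself is left out.

```python
from statistics import mean, median
from typing import Dict, List, Optional


def order_categories(
    values_by_category: Dict[str, List[float]],
    sort_by: str = "mean",
    ascending: bool = False,
    limit: Optional[int] = None,
) -> List[str]:
    """Return category names in display order, truncated to the first `limit`."""
    sort_keys = {
        "mean": lambda name: mean(values_by_category[name]),
        "median": lambda name: median(values_by_category[name]),
        "count": lambda name: len(values_by_category[name]),
        "alphabetical": lambda name: name,
    }
    ordered = sorted(values_by_category, key=sort_keys[sort_by], reverse=not ascending)
    return ordered if limit is None else ordered[:limit]


expression = {
    "neuron": [3.0, 5.0],
    "astrocyte": [1.0],
    "microglia": [2.0, 2.0, 2.0],
}

top_two_by_mean = order_categories(expression, sort_by="mean", limit=2)
by_count = order_categories(expression, sort_by="count")
```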
- field_info()
Get detailed information about available fields.
- Return type:
DataFrame
- Returns:
DataFrame with field names, types, unique value counts, and examples
CellExpressionMatrixResult
- class malva_client.models.CellExpressionMatrixResult(archive_path=None, export_record=None, client=None, *, cells=None, features=None, matrix_entries=None, normalization_factors=None, sample_metadata=None, barcodes=None, job_id=None, source='direct')
Bases: object
Per-cell result returned by MalvaClient.retrieve_cells().
New results are built directly from search and metadata endpoints. The older ZIP-backed constructor remains supported for compatibility.
- __init__(archive_path=None, export_record=None, client=None, *, cells=None, features=None, matrix_entries=None, normalization_factors=None, sample_metadata=None, barcodes=None, job_id=None, source='direct')
- property cells: DataFrame
Rows of the matrix: row_index, sample_id, cell_id.
- property features: DataFrame
Columns of the matrix: feature_index, job_id, feature, label, and source.
- property normalization_factors: DataFrame
Per-cell size factors aligned by row_index, when available.
- property sample_metadata: DataFrame
Sample metadata table keyed by sample_id, when available.
- property matrix_entries: DataFrame
Sparse matrix entries as row_index, feature_index, value.
Values are raw per-cell expression or k-mer hit counts. Missing row/feature pairs are zero.
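The convention that missing row/feature pairs are zero can be illustrated by expanding a few sparse entries into a dense grid. This is a sketch on plain tuples; `to_dense` is a hypothetical helper and the entries are made-up values.

```python
from typing import List, Tuple


def to_dense(
    entries: List[Tuple[int, int, int]],
    n_rows: int,
    n_features: int,
) -> List[List[int]]:
    """Expand (row_index, feature_index, value) triplets into a dense grid."""
    # Start from all zeros so that any pair absent from `entries` stays zero.
    dense = [[0] * n_features for _ in range(n_rows)]
    for row_index, feature_index, value in entries:
        dense[row_index][feature_index] = value
    return dense


# Three cells, two features, only two stored values.
entries = [(0, 0, 3), (2, 1, 1)]
dense = to_dense(entries, n_rows=3, n_features=2)
```

For realistic cell counts the sparse form is far smaller than the dense grid, which is presumably why the entries are exposed in long format.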
- positive_cells(feature=None, sample_ids=None)
Return cells with non-zero expression in any retrieved feature, or in one specified feature.
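The "any feature or one feature" selection can be sketched on sparse triplets. `positive_rows` is a hypothetical helper that works on row and feature indices only; the real method also accepts sample_ids, which this sketch leaves out.

```python
from typing import List, Optional, Tuple


def positive_rows(
    entries: List[Tuple[int, int, int]],
    feature_index: Optional[int] = None,
) -> List[int]:
    """Row indices with a non-zero value in any feature, or in one given feature."""
    rows = {
        row
        for row, feature, value in entries
        if value != 0 and (feature_index is None or feature == feature_index)
    }
    return sorted(rows)


# (row_index, feature_index, value); the last entry is an explicitly stored zero.
entries = [(0, 0, 3), (1, 1, 2), (2, 1, 1), (2, 0, 0)]

any_feature = positive_rows(entries)
feature_zero_only = positive_rows(entries, feature_index=0)
```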
- to_dataframe(normalized=False, include_sample_metadata=False)
Convert the sparse matrix to a long DataFrame.
- for_sample(sample_id, normalized=False, include_sample_metadata=False)
Return long matrix entries for one encoded sample ID.
- Return type:
DataFrame
- to_single_cell_result(feature=None, sample_ids=None, normalized=False)
Convert one retrieved feature to the legacy SingleCellResult shape.
This is useful for existing downstream code that expects the columns cell_id, expression, and sample_id.
- Return type:
SingleCellResult
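The legacy shape can be pictured as a join between the cells table and the sparse entries of a single feature. The following is a sketch under stated assumptions: `legacy_records` is a hypothetical helper, the data are invented, and the real conversion may treat zero-valued cells or normalization differently.

```python
from typing import Any, Dict, List, Tuple

# Rows of the matrix, using the columns documented for the `cells` property.
cells = [
    {"row_index": 0, "sample_id": 11, "cell_id": "AAAC"},
    {"row_index": 1, "sample_id": 11, "cell_id": "AAAG"},
    {"row_index": 2, "sample_id": 12, "cell_id": "AACT"},
]

# Sparse entries as (row_index, feature_index, value).
entries = [(0, 0, 3), (2, 0, 5), (1, 1, 2)]


def legacy_records(
    cells: List[Dict[str, Any]],
    entries: List[Tuple[int, int, int]],
    feature_index: int,
) -> List[Dict[str, Any]]:
    """Build cell_id / expression / sample_id records for one feature."""
    by_row = {cell["row_index"]: cell for cell in cells}
    records = []
    for row_index, feature, value in entries:
        if feature != feature_index:
            continue  # keep only the requested feature
        cell = by_row[row_index]
        records.append(
            {"cell_id": cell["cell_id"], "expression": value, "sample_id": cell["sample_id"]}
        )
    return records


records = legacy_records(cells, entries, feature_index=0)
```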
- project(dataset_id, sample_ids=None, feature=None, **kwargs)
Project retrieved positive cells onto a coexpression index.
- Parameters:
dataset_id (str) – Coexpression index or dataset identifier.
sample_ids (Union[int, List[int], None]) – Optional encoded sample ID or IDs to restrict cells.
feature (Union[str, int, None]) – Optional feature to restrict to cells positive for that feature. If omitted, uses all retrieved positive cells.
**kwargs – Additional coexpression parameters, such as top_n_genes.
- Return type:
SingleCellResult
- class malva_client.models.SingleCellResult(results_data, client=None)
Bases: object
Represents search results at the single-cell level (not aggregated by cell type).
- to_dataframe()
Convert results to a pandas DataFrame.
- Return type:
DataFrame
- Returns:
DataFrame with columns: cell_id, expression, sample_id
- filter_by_expression(min_expression=0, max_expression=inf)
Filter results by expression thresholds.
- Parameters:
- Return type:
SingleCellResult
- Returns:
New SingleCellResult with filtered data
- filter_by_samples(sample_ids)
Filter results to specific samples.
- Parameters:
- Return type:
SingleCellResult
- Returns:
New SingleCellResult with filtered data
- get_top_expressing_cells(n=100)
Get the top N expressing cells.
- Parameters:
n (int) – Number of top cells to return
- Return type:
SingleCellResult
- Returns:
New SingleCellResult with top expressing cells
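Threshold filtering and top-N selection on single-cell records can be sketched with plain lists. The helpers below are hypothetical stand-ins that reuse the documented method names and defaults; treating both bounds as inclusive is an assumption, since the reference does not say.

```python
import math
from typing import Any, Dict, List

Record = Dict[str, Any]


def filter_by_expression(
    records: List[Record],
    min_expression: float = 0,
    max_expression: float = math.inf,
) -> List[Record]:
    """Keep records whose expression lies within the bounds (assumed inclusive)."""
    return [r for r in records if min_expression <= r["expression"] <= max_expression]


def get_top_expressing_cells(records: List[Record], n: int = 100) -> List[Record]:
    """Return the n records with the highest expression, highest first."""
    return sorted(records, key=lambda r: r["expression"], reverse=True)[:n]


records = [
    {"cell_id": "AAAC", "expression": 1.0, "sample_id": 11},
    {"cell_id": "AAAG", "expression": 4.0, "sample_id": 11},
    {"cell_id": "AACT", "expression": 9.0, "sample_id": 12},
]

mid_range = filter_by_expression(records, min_expression=2, max_expression=5)
top_two = get_top_expressing_cells(records, n=2)
```

Both helpers return new lists and leave the input untouched, in the same spirit as the documented methods returning a new SingleCellResult.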
- aggregate_by_sample()
Aggregate expression data by sample.
- Return type:
DataFrame
- Returns:
DataFrame with sample-level statistics
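Sample-level aggregation can be sketched as grouping records by sample_id and summarising each group. Which statistics the library actually reports is not stated here, so the cell count, mean, and maximum below are illustrative assumptions, and `summarise_by_sample` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List


def summarise_by_sample(records: List[Dict[str, Any]]) -> Dict[int, Dict[str, float]]:
    """Group expression values by sample_id and compute simple statistics."""
    grouped: Dict[int, List[float]] = defaultdict(list)
    for record in records:
        grouped[record["sample_id"]].append(record["expression"])
    return {
        sample_id: {
            "n_cells": len(values),
            "mean_expression": mean(values),
            "max_expression": max(values),
        }
        for sample_id, values in grouped.items()
    }


records = [
    {"cell_id": "AAAC", "expression": 2.0, "sample_id": 11},
    {"cell_id": "AAAG", "expression": 4.0, "sample_id": 11},
    {"cell_id": "AACT", "expression": 5.0, "sample_id": 12},
]

summary = summarise_by_sample(records)
```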
CoverageResult
- class malva_client.models.CoverageResult(raw_data, client=None)
Bases: object
Represents genomic coverage data from the Malva genome browser.
Coverage data is organized as a matrix with genomic positions as rows and cell types as columns. Each matrix entry holds a coverage value.
- to_dataframe()
Convert coverage data to a pandas DataFrame.
- Return type:
DataFrame
- Returns:
DataFrame with positions as index and cell types as columns
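The positions-by-cell-types layout can be sketched as a nested dictionary keyed first by position and then by cell type. The payload shape used here (positions, cell_types, and a row-major coverage list) is a hypothetical format chosen for illustration; the raw response may be organized differently.

```python
from typing import Any, Dict, List

# Hypothetical payload: one coverage row per position, one column per cell type.
payload: Dict[str, Any] = {
    "positions": [1000, 1001, 1002],
    "cell_types": ["neuron", "astrocyte"],
    "coverage": [[5, 0], [7, 1], [6, 2]],
}


def coverage_table(data: Dict[str, Any]) -> Dict[int, Dict[str, int]]:
    """Index coverage by position (rows), then by cell type (columns)."""
    return {
        position: dict(zip(data["cell_types"], row))
        for position, row in zip(data["positions"], data["coverage"])
    }


table = coverage_table(payload)

# A single column read top to bottom gives the coverage track of one cell type.
neuron_track: List[int] = [table[position]["neuron"] for position in payload["positions"]]
```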
CoexpressionResult
- class malva_client.models.CoexpressionResult(raw_data, client=None)
Bases: object
Full coexpression analysis result from the Malva coexpression API.
Wraps the response from POST /api/coexpression/query-by-job and provides DataFrame conversions, top-gene retrieval, and plotting helpers for correlated genes, GO enrichment, and UMAP scores.
- genes_to_dataframe()
Convert correlated genes to a DataFrame.
- Return type:
DataFrame
- Returns:
DataFrame with columns: gene, correlation, p_value (plus any extra fields returned by the server)
- scores_to_dataframe()
Convert UMAP scores to a DataFrame.
- Return type:
DataFrame
- Returns:
DataFrame with metacell-level score data
- umap_to_dataframe()
Convert UMAP score data to a DataFrame with x/y coordinates.
Falls back to scores_to_dataframe() when coordinates are embedded in the scores payload.
- Return type:
DataFrame
- Returns:
DataFrame with UMAP coordinates and scores
- go_to_dataframe()
Convert GO enrichment results to a DataFrame.
- Return type:
DataFrame
- Returns:
DataFrame with columns such as go_id, name, fdr, etc.
- cell_type_enrichment_to_dataframe()
Convert cell-type enrichment to a DataFrame.
- Return type:
DataFrame
- Returns:
DataFrame with cell-type enrichment data
- tissue_breakdown_to_dataframe()
Convert tissue breakdown to a DataFrame.
- Return type:
DataFrame
- Returns:
DataFrame with tissue breakdown data
- plot_umap(color_by='positive_fraction', point_size=None, cmap='viridis', figsize=(10, 8))
Scatter plot of UMAP coordinates coloured by a score column.
UMAPCoordinates
- class malva_client.models.UMAPCoordinates(raw_data, client=None)
Bases: object
Lightweight container for UMAP coordinates from the coexpression API.
Wraps the compact parallel-array format returned by GET /api/coexpression/umap/<dataset_id> and provides conversion to a pandas DataFrame and a simple scatter-plot method.
- to_dataframe()
Convert to a pandas DataFrame.
- Return type:
DataFrame
- Returns:
DataFrame with columns: x, y, metacell_id, n_cells, sample, cluster
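A parallel-array payload stores one list per column, with index i of every list describing the same metacell. Turning it into row records can be sketched as below; the payload keys are assumed to match the documented column names, and the values are invented.

```python
from typing import Any, Dict, List

# Hypothetical payload in parallel-array form: one list per column.
payload: Dict[str, List[Any]] = {
    "x": [0.1, 0.2],
    "y": [1.0, 2.0],
    "metacell_id": [0, 1],
    "n_cells": [40, 25],
    "sample": ["s1", "s2"],
    "cluster": ["A", "B"],
}

columns = ["x", "y", "metacell_id", "n_cells", "sample", "cluster"]

# zip(*lists) walks all arrays in lockstep, yielding one tuple per metacell.
rows = [
    dict(zip(columns, values))
    for values in zip(*(payload[column] for column in columns))
]
```

The parallel-array form avoids repeating the column names for every point, which keeps the response compact for large embeddings.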