Data & Analytics
Page 2 of 5
Browse skills in this category.
astropy
Data & Analyticsby K-Dense-AI
Comprehensive Python library for astronomy and astrophysics. This skill should be used when working with astronomical data including celestial coordinates, physical units, FITS files, cosmological calculations, time systems, tables, world coordinate systems (WCS), and astronomical data analysis. Use when tasks involve coordinate transformations, unit conversions, FITS file manipulation, cosmological distance calculations, time scale conversions, or astronomical data processing.
cosmic-database
Data & Analyticsby K-Dense-AI
Access COSMIC cancer mutation database. Query somatic mutations, Cancer Gene Census, mutational signatures, gene fusions, for cancer research and precision oncology. Requires authentication.
dask
Data & Analyticsby K-Dense-AI
Distributed computing for larger-than-RAM pandas/NumPy workflows. Use when you need to scale existing pandas/NumPy code beyond memory or across clusters. Best for parallel file processing, distributed ML, integration with existing pandas code. For out-of-core analytics on single machine use vaex; for in-memory speed use polars.
drugbank-database
Data & Analyticsby K-Dense-AI
Access and analyze comprehensive drug information from the DrugBank database including drug properties, interactions, targets, pathways, chemical structures, and pharmacology data. This skill should be used when working with pharmaceutical data, drug discovery research, pharmacology studies, drug-drug interaction analysis, target identification, chemical similarity searches, ADMET predictions, or any task requiring detailed drug and drug target information from DrugBank.
exploratory-data-analysis
Data & Analyticsby K-Dense-AI
Perform comprehensive exploratory data analysis on scientific data files across 200+ file formats. This skill should be used when analyzing any scientific data file to understand its structure, content, quality, and characteristics. Automatically detects file type and generates detailed markdown reports with format-specific analysis, quality metrics, and downstream analysis recommendations. Covers chemistry, bioinformatics, microscopy, spectroscopy, proteomics, metabolomics, and general scientific data formats.
geniml
Data & Analyticsby K-Dense-AI
This skill should be used when working with genomic interval data (BED files) for machine learning tasks. Use for training region embeddings (Region2Vec, BEDspace), single-cell ATAC-seq analysis (scEmbed), building consensus peaks (universes), or any ML-based analysis of genomic regions. Applies to BED file collections, scATAC-seq data, chromatin accessibility datasets, and region-based genomic feature learning.
geo-database
Data & Analyticsby K-Dense-AI
Access NCBI GEO for gene expression/genomics data. Search/download microarray and RNA-seq datasets (GSE, GSM, GPL), retrieve SOFT/Matrix files, for transcriptomics and expression analysis.
geopandas
Data & Analyticsby K-Dense-AI
Python library for working with geospatial vector data including shapefiles, GeoJSON, and GeoPackage files. Use when working with geographic data for spatial analysis, geometric operations, coordinate transformations, spatial joins, overlay operations, choropleth mapping, or any task involving reading/writing/analyzing vector geographic data. Supports PostGIS databases, interactive maps, and integration with matplotlib/folium/cartopy. Use for tasks like buffer analysis, spatial joins between datasets, dissolving boundaries, clipping data, calculating areas/distances, reprojecting coordinate systems, creating maps, or converting between spatial file formats.
hmdb-database
Data & Analyticsby K-Dense-AI
Access Human Metabolome Database (220K+ metabolites). Search by name/ID/structure, retrieve chemical properties, biomarker data, NMR/MS spectra, pathways, for metabolomics and identification.
lamindb
Data & Analyticsby K-Dense-AI
This skill should be used when working with LaminDB, an open-source data framework for biology that makes data queryable, traceable, reproducible, and FAIR. Use when managing biological datasets (scRNA-seq, spatial, flow cytometry, etc.), tracking computational workflows, curating and validating data with biological ontologies, building data lakehouses, or ensuring data lineage and reproducibility in biological research. Covers data management, annotation, ontologies (genes, cell types, diseases, tissues), schema validation, integrations with workflow managers (Nextflow, Snakemake) and MLOps platforms (W&B, MLflow), and deployment strategies.
matlab
Data & Analyticsby K-Dense-AI
MATLAB and GNU Octave numerical computing for matrix operations, data analysis, visualization, and scientific computing. Use when writing MATLAB/Octave scripts for linear algebra, signal processing, image processing, differential equations, optimization, statistics, or creating scientific visualizations. Also use when the user needs help with MATLAB syntax, functions, or wants to convert between MATLAB and Python code. Scripts can be executed with MATLAB or the open-source GNU Octave interpreter.
neurokit2
Data & Analyticsby K-Dense-AI
Comprehensive biosignal processing toolkit for analyzing physiological data including ECG, EEG, EDA, RSP, PPG, EMG, and EOG signals. Use this skill when processing cardiovascular signals, brain activity, electrodermal responses, respiratory patterns, muscle activity, or eye movements. Applicable for heart rate variability analysis, event-related potentials, complexity measures, autonomic nervous system assessment, psychophysiology research, and multi-modal physiological signal integration.
pyhealth
Data & Analyticsby K-Dense-AI
Comprehensive healthcare AI toolkit for developing, testing, and deploying machine learning models with clinical data. This skill should be used when working with electronic health records (EHR), clinical prediction tasks (mortality, readmission, drug recommendation), medical coding systems (ICD, NDC, ATC), physiological signals (EEG, ECG), healthcare datasets (MIMIC-III/IV, eICU, OMOP), or implementing deep learning models for healthcare applications (RETAIN, SafeDrug, Transformer, GNN).
pyopenms
Data & Analyticsby K-Dense-AI
Complete mass spectrometry analysis platform. Use for proteomics workflows feature detection, peptide identification, protein quantification, and complex LC-MS/MS pipelines. Supports extensive file formats and algorithms. Best for proteomics, comprehensive MS data processing. For simple spectral comparison and metabolite ID use matchms.
research-lookup
Data & Analyticsby K-Dense-AI
Look up current research information using Perplexity Sonar Pro Search or Sonar Reasoning Pro models through OpenRouter. Automatically selects the best model based on query complexity. Search academic papers, recent studies, technical documentation, and general research information with citations.
scientific-brainstorming
Data & Analyticsby K-Dense-AI
Creative research ideation and exploration. Use for open-ended brainstorming sessions, exploring interdisciplinary connections, challenging assumptions, or identifying research gaps. Best for early-stage research planning when you do not have specific observations yet. For formulating testable hypotheses from data use hypothesis-generation.
scientific-critical-thinking
Data & Analyticsby K-Dense-AI
Evaluate scientific claims and evidence quality. Use for assessing experimental design validity, identifying biases and confounders, applying evidence grading frameworks (GRADE, Cochrane Risk of Bias), or teaching critical analysis. Best for understanding evidence quality, identifying flaws. For formal peer review writing use peer-review.
scikit-bio
Data & Analyticsby K-Dense-AI
Biological data toolkit. Sequence analysis, alignments, phylogenetic trees, diversity metrics (alpha/beta, UniFrac), ordination (PCoA), PERMANOVA, FASTA/Newick I/O, for microbiome analysis.
scvi-tools
Data & Analyticsby K-Dense-AI
Deep generative models for single-cell omics. Use when you need probabilistic batch correction (scVI), transfer learning, differential expression with uncertainty, or multi-modal integration (TOTALVI, MultiVI). Best for advanced modeling, batch effects, multimodal data. For standard analysis pipelines use scanpy.
statistical-analysis
Data & Analyticsby K-Dense-AI
Guided statistical analysis with test selection and reporting. Use when you need help choosing appropriate tests for your data, assumption checking, power analysis, and APA-formatted results. Best for academic research reporting, test selection guidance. For implementing specific models programmatically use statsmodels.
torch-geometric
Data & Analyticsby K-Dense-AI
Graph Neural Networks (PyG). Node/graph classification, link prediction, GCN, GAT, GraphSAGE, heterogeneous graphs, molecular property prediction, for geometric deep learning.
umap-learn
Data & Analyticsby K-Dense-AI
UMAP dimensionality reduction. Fast nonlinear manifold learning for 2D/3D visualization, clustering preprocessing (HDBSCAN), supervised/parametric UMAP, for high-dimensional data.
vaex
Data & Analyticsby K-Dense-AI
Use this skill for processing and analyzing large tabular datasets (billions of rows) that exceed available RAM. Vaex excels at out-of-core DataFrame operations, lazy evaluation, fast aggregations, efficient visualization of big data, and machine learning on large datasets. Apply when users need to work with large CSV/HDF5/Arrow/Parquet files, perform fast statistics on massive datasets, create visualizations of big data, or build ML pipelines that do not fit in memory.
query-builder
Data & Analyticsby clidey
Convert natural language questions into SQL queries. Activates when users ask data questions in plain English like "show me users who signed up last week" or "find orders over $100".
schema-designer
Data & Analyticsby clidey
Help design database schemas, create tables, and plan data models. Activates when users ask to create tables, design schemas, or model data relationships.
hunt-analytics-generation
Data & Analyticsby OTRF
Generate query-agnostic analytics that model adversary behavior by translating hunt investigative intent into analytic definitions grounded in schema semantics. This skill is used to define how behavior should manifest in data before query execution or validation, and works best when informed by system internals, adversary tradecraft, a structured hunt focus, and suggested data sources.
hunt-blueprint-generation
Data & Analyticsby OTRF
Assemble a complete hunt blueprint by consolidating outputs from prior hunt planning skills into a single, structured plan for execution. Use this skill after system and tradecraft research, hunt focus definition, data source identification, and analytics generation have been completed. This skill is synthesis and packaging only and must not introduce new research, assumptions, or analytics.
hunt-data-source-identification
Data & Analyticsby OTRF
Identify relevant security data sources that could capture the behavior defined in a structured hunt hypothesis. Use this skill after the hunt focus has been defined to translate investigative intent into candidate telemetry sources using existing platform catalogs. This skill supports hunt planning by reasoning over available schemas and metadata before analytics development or query execution.
hunt-focus-definition
Data & Analyticsby OTRF
Define a focused hunt hypothesis by synthesizing completed system internals and adversary tradecraft research. Use this skill after research has been completed to narrow a high-level hunt topic into a single, concrete attack pattern with clear investigative intent. This skill produces a structured, testable hypothesis and should be used before selecting data sources, defining environment scope, or developing analytics.
thealgorithm
Data & AnalyticsUniversal execution engine using scientific method to achieve ideal state. USE WHEN complex tasks, multi-step work, "run the algorithm", "use the algorithm", OR any non-trivial request that benefits from structured execution with ISC (Ideal State Criteria) tracking.
creating-financial-models
Data & Analyticsby modelscope
This skill provides an advanced financial modeling suite with DCF analysis, sensitivity testing, Monte Carlo simulations, and scenario planning for investment decisions
infographic-creator
Data & Analyticsby antvis
Create beautiful infographics based on the given text content. Use this when users request creating infographics.
docetl
Data & Analyticsby ucbepic
Build and run LLM-powered data processing pipelines with DocETL. Use when users say "docetl", want to analyze unstructured data, process documents, extract information, or run ETL tasks on text. Helps with data collection, pipeline creation, execution, and optimization.
linux-production-shell-scripts
Data & Analyticsby zebbern
This skill should be used when the user asks to "create bash scripts", "automate Linux tasks", "monitor system resources", "backup files", "manage users", or "write production shell scripts". It provides ready-to-use shell script templates for system administration.
sql-injection-testing
Data & Analyticsby zebbern
This skill should be used when the user asks to "test for SQL injection vulnerabilities", "perform SQLi attacks", "bypass authentication using SQL injection", "extract database information through injection", "detect SQL injection flaws", or "exploit database query vulnerabilities". It provides comprehensive techniques for identifying, exploiting, and understanding SQL injection attack vectors across different database systems.
sqlmap-database-penetration-testing
Data & Analyticsby zebbern
This skill should be used when the user asks to "automate SQL injection testing," "enumerate database structure," "extract database credentials using sqlmap," "dump tables and columns from a vulnerable database," or "perform automated database penetration testing." It provides comprehensive guidance for using SQLMap to detect and exploit SQL injection vulnerabilities.
wireshark-network-traffic-analysis
Data & Analyticsby zebbern
This skill should be used when the user asks to "analyze network traffic with Wireshark", "capture packets for troubleshooting", "filter PCAP files", "follow TCP/UDP streams", "detect network anomalies", "investigate suspicious traffic", or "perform protocol analysis". It provides comprehensive techniques for network packet capture, filtering, and analysis using Wireshark.
leann-search
Data & Analyticsby parcadei
Semantic search across codebase using LEANN vector index
rudin-real-complex-analysis
Data & Analyticsby parcadei
Problem-solving with Rudin's Real and Complex Analysis textbook
06-database-script-management
Data & Analytics数据库脚本管理规范,涵盖 DDL/DML 脚本编写、版本命名规则、增量更新策略、数据迁移、回滚方案。当用户编写数据库脚本、新增表结构、修改字段、进行数据迁移或管理 SQL 版本时使用。
23-database-sharding
Data & Analytics数据库分片指南,涵盖分片策略设计、分片键选择、跨分片查询、数据迁移、分片路由规则。当用户设计数据库分片、选择分片键、处理跨分片查询或进行分片数据迁移时使用。
29-4-process-dao-database
Data & AnalyticsProcess 模块 DAO 层与数据库表结构详细分析,涵盖 JOOQ 使用、表结构设计、索引优化、数据分片。当用户开发 Process 数据访问、设计表结构、优化查询性能或处理数据存储时使用。
44-database-design
Data & AnalyticsBK-CI 数据库设计规范与表结构指南,涵盖命名规范、字段类型选择、索引设计、分表策略、数据归档。当用户设计数据库表、优化索引、规划分表策略或进行数据库架构设计时使用。
obsidian-bases
Data & Analyticsby davepoon
Create and edit Obsidian Bases (.base files) with views, filters, formulas, and summaries. Use when working with .base files, creating database-like views of notes, or when the user mentions Bases, table views, card views, filters, or formulas in Obsidian.
server-components
Data & Analyticsby davepoon
This skill should be used when the user asks about "Server Components", "Client Components", "'use client' directive", "when to use server vs client", "RSC patterns", "component composition", "data fetching in components", or needs guidance on React Server Components architecture in Next.js.
postgis-skill
Data & Analyticsby postgis
PostGIS-focused SQL tips, tricks and gotchas. Use when in need of dealing with geospatial data in Postgres.
scientific-critical-thinking
Data & Analyticsby sonofmagic
Evaluate research rigor. Assess methodology, experimental design, statistical validity, biases, confounding, evidence quality (GRADE, Cochrane ROB), for critical analysis of scientific claims.
disk-usage
Data & AnalyticsAnalyze disk space usage and filesystem information including mounts, usage, and large files
network-info
Data & AnalyticsGather network configuration and connectivity information including interfaces, routes, and DNS
n8n-workflow-patterns
Data & Analyticsby czlonkowski
Proven workflow architectural patterns from real n8n workflows. Use when building new workflows, designing workflow structure, choosing workflow patterns, planning workflow architecture, or asking about webhook processing, HTTP API integration, database operations, AI agent workflows, or scheduled tasks.
design-postgres-tables
Data & Analyticsby timescale
Comprehensive PostgreSQL-specific table design reference covering data types, indexing, constraints, performance patterns, and advanced features
find-hypertable-candidates
Data & Analyticsby timescale
Analyze an existing PostgreSQL database to identify tables that would benefit from conversion to TimescaleDB hypertables
migrate-postgres-tables-to-hypertables
Data & Analyticsby timescale
Comprehensive guide for migrating PostgreSQL tables to TimescaleDB hypertables with optimal configuration and performance validation
setup-timescaledb-hypertables
Data & Analyticsby timescale
Step-by-step instructions for designing table schemas and setting up TimescaleDB with hypertables, indexes, compression, retention policies, and continuous aggregates. Instructions for selecting: partition columns, segment_by columns, order_by columns, chunk time interval, real-time aggregation.
kotlin-expert
Data & AnalyticsAdvanced Kotlin patterns for AmethystMultiplatform. Flow state management (StateFlow/SharedFlow), sealed hierarchies (classes vs interfaces), immutability (@Immutable, data classes), DSL builders (type-safe fluent APIs), inline functions (reified generics, performance). Use when working with: (1) State management patterns (StateFlow/SharedFlow/MutableStateFlow), (2) Sealed classes or sealed interfaces, (3) @Immutable annotations for Compose, (4) DSL builders with lambda receivers, (5) inline/reified functions, (6) Kotlin performance optimization. Complements kotlin-coroutines agent (async patterns) - this skill focuses on Amethyst-specific Kotlin idioms.
databases
Data & Analyticsby mrgoonie
Work with MongoDB (document database, BSON documents, aggregation pipelines, Atlas cloud) and PostgreSQL (relational database, SQL queries, psql CLI, pgAdmin). Use when designing database schemas, writing queries and aggregations, optimizing indexes for performance, performing database migrations, configuring replication and sharding, implementing backup and restore strategies, managing database users and permissions, analyzing query performance, or administering production databases.
sequential-thinking
Data & Analyticsby mrgoonie
Use when complex problems require systematic step-by-step reasoning with ability to revise thoughts, branch into alternative approaches, or dynamically adjust scope. Ideal for multi-stage analysis, design planning, problem decomposition, or tasks with initially unclear scope.
simplification-cascades
Data & Analyticsby mrgoonie
Find one insight that eliminates multiple components - "if this is true, we don't need X, Y, or Z
optimizing-performance
Data & Analyticsby CloudAI-X
Analyzes and optimizes application performance across frontend, backend, and database layers. Use when diagnosing slowness, improving load times, optimizing queries, reducing bundle size, or when asked about performance issues.
executive-briefing
Data & Analyticsby anthropics
Transforms research findings into executive-ready briefings. Automatically activated when user mentions 'executive', 'briefing', 'C-suite', 'board', 'leadership', or 'presentation'.
performing-causal-analysis
Data & Analyticsby pymc-labs
Fits causal models, estimates impacts, and plots results using CausalPy. Use when performing analysis with DiD, ITS, SC, or RD.
running-placebo-analysis
Data & Analyticsby pymc-labs
Performs placebo-in-time sensitivity analysis to validate causal claims. Use when checking model robustness, verifying lack of pre-intervention effects, or ensuring observed effects are not spurious.
working-with-marimo
Data & Analyticsby pymc-labs
Interactive development in marimo notebooks with validation loops. Use for creating/editing marimo notebooks and verifying execution.
insforge-schema-patterns
Data & Analyticsby InsForge
Database schema patterns for InsForge including social graphs, e-commerce, content publishing, and multi-tenancy with RLS policies. Use when designing data models with relationships, foreign keys, or Row Level Security.
github-archive
Data & Analyticsby gadievron
Investigate GitHub security incidents using tamper-proof GitHub Archive data via BigQuery. Use when verifying repository activity claims, recovering deleted PRs/branches/tags/repos, attributing actions to actors, or reconstructing attack timelines. Provides immutable forensic evidence of all public GitHub events since 2011.
aggregating-performance-metrics
Data & AnalyticsAggregate and centralize performance metrics from applications, systems, databases, caches, and services. Use when consolidating monitoring data from multiple sources. Trigger with phrases like "aggregate metrics", "centralize monitoring", or "collect performance data".