SOTAVerified

Feature Engineering

Feature engineering is the process of taking a dataset and constructing explanatory variables, called features, that can be used to train a machine learning model for a prediction problem. Often, data is spread across multiple tables and must be gathered into a single table, with observations in the rows and features in the columns.
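As a rough illustration of gathering data from several tables into one feature table, the sketch below aggregates a child table and joins it onto a parent table with pandas. The `customers` and `orders` tables, their column names, and the chosen aggregations are all hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical parent table: one row per customer (the observations).
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "signup_year": [2019, 2020, 2021],
})

# Hypothetical child table: zero or more orders per customer.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [10.0, 30.0, 5.0, 15.0, 40.0],
})

# Collapse the child table to one row per customer.
order_features = (
    orders.groupby("customer_id")["amount"]
    .agg(order_count="count", total_amount="sum", mean_amount="mean")
    .reset_index()
)

# A left join keeps customers that have no orders at all.
feature_table = customers.merge(order_features, on="customer_id", how="left")

# No orders means a count and total of zero; the mean is left as NaN
# because it is undefined for an empty group.
count_and_total = ["order_count", "total_amount"]
feature_table[count_and_total] = feature_table[count_and_total].fillna(0)
feature_table["order_count"] = feature_table["order_count"].astype(int)

print(feature_table)
```

The result has one row per customer and one column per feature, which is the shape most model-training code expects.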

The traditional approach to feature engineering is to build features one at a time using domain knowledge, a tedious, time-consuming, and error-prone process known as manual feature engineering. The code for manual feature engineering is problem-dependent and must be rewritten for each new dataset.
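A minimal sketch of what manual feature engineering can look like in practice follows. The loan-application table, its column names, and the three hand-picked features are assumptions made for illustration; none of them come from a specific dataset or paper.

```python
import numpy as np
import pandas as pd

# Hypothetical loan applications; all column names are assumptions.
loans = pd.DataFrame({
    "income": [40000.0, 85000.0, 0.0],
    "debt": [10000.0, 17000.0, 5000.0],
    "applied_at": pd.to_datetime(["2024-01-06", "2024-01-08", "2024-01-10"]),
})


def add_manual_features(df):
    """Add hand-written features, each encoding one piece of domain knowledge."""
    out = df.copy()
    # Debt-to-income ratio; a zero income yields NaN instead of infinity.
    out["debt_to_income"] = out["debt"] / out["income"].replace(0.0, np.nan)
    # Weekend flag: dayofweek is 0 for Monday, so 5 and 6 are the weekend.
    out["applied_on_weekend"] = out["applied_at"].dt.dayofweek >= 5
    # Log-scaled income to dampen the effect of very large values.
    out["log_income"] = np.log1p(out["income"])
    return out


features = add_manual_features(loans)
print(features)
```

Every line in `add_manual_features` is tied to these particular columns, which is why code of this kind has to be written again for each new dataset.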

Papers

Showing 251–300 of 1706 papers

Title | Status | Hype
F-RBA: A Federated Learning-based Framework for Risk-based Authentication | | 0
Feature engineering vs. deep learning for paper section identification: Toward applications in Chinese medical literature | | 0
A Progressive Transformer for Unifying Binary Code Embedding and Knowledge Transfer | | 0
Deep Learning-Based Noninvasive Screening of Type 2 Diabetes with Chest X-ray Images and Electronic Health Records | Code | 0
Modeling Story Expectations to Understand Engagement: A Generative Framework Using LLMs | | 0
Vision Transformers for Efficient Indoor Pathloss Radio Map Prediction | | 0
Image-Based Malware Classification Using QR and Aztec Codes | | 0
Robust Feature Engineering Techniques for Designing Efficient Motor Imagery-Based BCI-Systems | | 0
RUL forecasting for wind turbine predictive maintenance based on deep learning | | 0
Parkinson's Disease Diagnosis Through Deep Learning: A Novel LSTM-Based Approach for Freezing of Gait Detection | | 0
PRECISE: Pre-training Sequential Recommenders with Collaborative and Semantic Information | | 0
Federated Automated Feature Engineering | | 0
Deep Learning in Single-Cell and Spatial Transcriptomics Data Analysis: Advances and Challenges from a Data Science Perspective | | 0
Comparative Performance of Machine Learning Algorithms for Early Genetic Disorder and Subclass Classification | | 0
Intelligent Spark Agents: A Modular LangGraph Framework for Scalable, Visualized, and Enhanced Big Data Machine Learning Workflows | | 0
HiCat: A Semi-Supervised Approach for Cell Type Annotation | | 0
An AutoML-based approach for Network Intrusion Detection | | 0
Enhancing Molecular Design through Graph-based Topological Reinforcement Learning | | 0
Understanding LLM Embeddings for Regression | | 0
Advancing Heatwave Forecasting via Distribution Informed-Graph Neural Networks (DI-GNNs): Integrating Extreme Value Theory with GNNs | | 0
Is Precise Recovery Necessary? A Task-Oriented Imputation Approach for Time Series Forecasting on Variable Subset | | 0
What makes a good BIM design: quantitative linking between design behavior and quality | | 0
GPTree: Towards Explainable Decision-Making via LLM-powered Decision Trees | | 0
Large Language Models for Constructing and Optimizing Machine Learning Workflows: A Survey | Code | 0
Classification of residential and non-residential buildings based on satellite data using deep learning | | 0
RAGulator: Lightweight Out-of-Context Detectors for Grounded Text Generation | | 0
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level | | 0
Correlation of Object Detection Performance with Visual Saliency and Depth Estimation | Code | 0
Explainable cognitive decline detection in free dialogues with a Machine Learning approach based on pre-trained Large Language Models | | 0
Exploring Feature Importance and Explainability Towards Enhanced ML-Based DoS Detection in AI Systems | | 0
See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers | | 0
Enriching Tabular Data with Contextual LLM Embeddings: A Comprehensive Ablation Study for Ensemble Classifiers | | 0
Enhancing Glucose Level Prediction of ICU Patients through Hierarchical Modeling of Irregular Time-Series | Code | 0
Machine Learning Framework for Audio-Based Content Evaluation using MFCC, Chroma, Spectral Contrast, and Temporal Feature Engineering | | 0
Large Language Models Engineer Too Many Simple Features For Tabular Data | Code | 0
Predicting 30-Day Hospital Readmission in Medicare Patients: Insights from an LSTM Deep Learning Model | | 0
AdaptoML-UX: An Adaptive User-centered GUI-based AutoML Toolkit for Non-AI Experts and HCI Researchers | Code | 0
Molecular Topological Profile (MOLTOP) - Simple and Strong Baseline for Molecular Graph Classification | Code | 0
Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature | Code | 0
ELF-Gym: Evaluating Large Language Models Generated Features for Tabular Prediction | Code | 0
Statistical Test for Auto Feature Engineering by Selective Inference | Code | 0
Sui Generis: Large Language Models for Authorship Attribution and Verification in Latin | | 0
Towards Trustworthy Web Attack Detection: An Uncertainty-Aware Ensemble Deep Kernel Learning Model | | 0
Principal Orthogonal Latent Components Analysis (POLCA Net) | Code | 0
Neural-Bayesian Program Learning for Few-shot Dialogue Intent Parsing | | 0
Learning to Solve Abstract Reasoning Problems with Neurosymbolic Program Synthesis and Task Generation | | 0
Self-eXplainable AI for Medical Image Analysis: A Survey and New Outlooks | | 0
Semantic-Guided RL for Interpretable Feature Engineering | | 0
Enhancing End Stage Renal Disease Outcome Prediction: A Multi-Sourced Data-Driven Approach | | 0
Automatic deductive coding in discourse analysis: an application of large language models in learning analytics | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CNN | 14 gestures accuracy | 0.98 | | Unverified