Feature Engineering

Feature engineering is the process of constructing explanatory variables, called features, from a dataset so that they can be used to train a machine learning model for a prediction problem. Often the data is spread across multiple tables and must be gathered into a single table, with one row per observation and one column per feature.
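As a minimal sketch of gathering several tables into one, the example below aggregates a child table into a parent table so that each observation ends up as a single row of features. The `customers` and `orders` tables, their column names, and the `build_feature_table` helper are all hypothetical and exist only for illustration. Plain Python is used to keep the example dependency-free; in practice a dataframe library would usually do this work.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parent table: one row per observation (customer).
customers = [
    {"customer_id": 1, "country": "DE"},
    {"customer_id": 2, "country": "US"},
    {"customer_id": 3, "country": "US"},
]

# Hypothetical child table: zero or more rows per customer.
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 2, "amount": 15.0},
]


def build_feature_table(customers, orders):
    """Return one row per customer, with aggregated order features as columns."""
    # Group order amounts by the key shared with the parent table.
    amounts = defaultdict(list)
    for order in orders:
        amounts[order["customer_id"]].append(order["amount"])

    rows = []
    for customer in customers:
        spent = amounts.get(customer["customer_id"], [])
        rows.append({
            **customer,
            "order_count": len(spent),
            "order_total": sum(spent),
            # Customers without orders get 0.0 rather than an error.
            "order_mean": mean(spent) if spent else 0.0,
        })
    return rows


feature_table = build_feature_table(customers, orders)
for row in feature_table:
    print(row)
```

Customers with no orders still appear in the result, which keeps the observation set intact; how to fill their aggregate values (here 0 and 0.0) is a modelling choice rather than a rule.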

The traditional approach to feature engineering is to build features one at a time using domain knowledge. This process, known as manual feature engineering, is tedious, time-consuming, and error-prone. The code for manual feature engineering is problem-dependent and must be rewritten for each new dataset.
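To illustrate what building features one at a time can look like, here is a small sketch for a hypothetical loan dataset. The record fields, the `loan_features` helper, and the 80% utilization threshold are assumptions made up for this example, not part of any particular dataset or method. Each feature encodes one piece of domain knowledge by hand.

```python
from datetime import date


def loan_features(record, as_of):
    """Hand-built features for one hypothetical loan applicant record."""
    income = record["annual_income"]
    return {
        # Domain knowledge: debt relative to income matters more than raw debt.
        "debt_to_income": record["total_debt"] / income if income else None,
        # Domain knowledge: older accounts suggest a longer credit history.
        "account_age_days": (as_of - record["opened_on"]).days,
        # Assumed threshold: using more than 80% of the limit is flagged.
        "is_high_utilization": record["credit_used"] > 0.8 * record["credit_limit"],
    }


example = {
    "annual_income": 50000,
    "total_debt": 10000,
    "opened_on": date(2024, 1, 1),
    "credit_used": 900,
    "credit_limit": 1000,
}

features = loan_features(example, as_of=date(2025, 1, 1))
print(features)
```

None of these features would carry over to, say, a sensor or text dataset, which is the sense in which such code is problem-dependent.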

Papers

Showing 51–100 of 1706 papers

Title	Date	Tasks	Status	Hype
Feature Engineering on LMS Data to Optimize Student Performance Prediction	Apr 3, 2025	Feature Engineering, Management	Unverified	0
Unleashing the Power of Pre-trained Encoders for Universal Adversarial Attack Detection	Apr 1, 2025	Adversarial Attack, Adversarial Attack Detection	Unverified	0
SPIO: Ensemble and Selective Strategies via LLM-Based Multi-Agent Planning in Automated Data Science	Mar 30, 2025	Feature Engineering	Unverified	0
FeRG-LLM : Feature Engineering by Reason Generation Large Language Models	Mar 30, 2025	Feature Engineering, Large Language Model	Unverified	0
RocketPPA: Code-Level Power, Performance, and Area Prediction via LLM and Mixture of Experts	Mar 27, 2025	Code Repair, Feature Engineering	Unverified	0
Embedding Domain-Specific Knowledge from LLMs into the Feature Engineering Pipeline	Mar 27, 2025	Feature Engineering, feature selection	Unverified	0
Feature-Enhanced Machine Learning for All-Cause Mortality Prediction in Healthcare Data	Mar 27, 2025	All, Feature Engineering	Unverified	0
Asset price movement prediction using empirical mode decomposition and Gaussian mixture models	Mar 26, 2025	Ensemble Learning, Feature Engineering	Unverified	0
Machine Learning - Driven Materials Discovery: Unlocking Next-Generation Functional Materials -- A minireview	Mar 22, 2025	AutoML, Bayesian Optimization	Unverified	0
NeuralFoil: An Airfoil Aerodynamics Analysis Tool Using Physics-Informed Machine Learning	Mar 20, 2025	Feature Engineering, Physics-informed machine learning	Code Available	3
LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers	Mar 18, 2025	Automated Feature Engineering, Feature Engineering	Code Available	2
CoDet-M4: Detecting Machine-Generated Code in Multi-Lingual, Multi-Generator and Multi-Domain Settings	Mar 17, 2025	Code Generation, Ethics	Unverified	0
Applications of Large Language Model Reasoning in Feature Generation	Mar 15, 2025	Computational Efficiency, Domain Adaptation	Unverified	0
VORTEX: Challenging CNNs at Texture Recognition by using Vision Transformers with Orderless and Randomized Token Encodings	Mar 9, 2025	Computational Efficiency, Feature Engineering	Code Available	0
Exploring LLM Agents for Cleaning Tabular Machine Learning Datasets	Mar 9, 2025	Data Integration, Feature Engineering	Unverified	0
Bridging the Semantic Gap in Virtual Machine Introspection and Forensic Memory Analysis	Mar 7, 2025	Feature Engineering	Unverified	0
YARE-GAN: Yet Another Resting State EEG-GAN	Mar 4, 2025	EEG, Feature Engineering	Code Available	0
Efficient or Powerful? Trade-offs Between Machine Learning and Deep Learning for Mental Illness Detection on Social Media	Mar 3, 2025	Computational Efficiency, Feature Engineering	Unverified	0
Integrating convolutional layers and biformer network with forward-forward and backpropagation training	Feb 28, 2025	Computational chemistry, Drug Discovery	Code Available	0
Improving Representation Learning of Complex Critical Care Data with ICU-BERT	Feb 26, 2025	Feature Engineering, Language Modeling	Unverified	0
Mitigating Attrition: Data-Driven Approach Using Machine Learning and Data Engineering	Feb 25, 2025	Feature Engineering	Unverified	0
Edge Training and Inference with Analog ReRAM Technology for Hand Gesture Recognition	Feb 25, 2025	Feature Engineering, Gesture Recognition	Unverified	0
TabulaTime: A Novel Multimodal Deep Learning Framework for Advancing Acute Coronary Syndrome Prediction through Environmental and Clinical Data Integration	Feb 24, 2025	Data Integration, Feature Engineering	Unverified	0
ML-Driven Approaches to Combat Medicare Fraud: Advances in Class Imbalance Solutions, Feature Engineering, Adaptive Learning, and Business Impact	Feb 21, 2025	Diagnostic, Dimensionality Reduction	Unverified	0
A Defensive Framework Against Adversarial Attacks on Machine Learning-Based Network Intrusion Detection Systems	Feb 21, 2025	Ensemble Learning, Feature Engineering	Unverified	0
Feature Engineering Approach to Building Load Prediction: A Case Study for Commercial Building Chiller Plant Optimization in Tropical Weather	Feb 17, 2025	Clustering, Feature Engineering	Unverified	0
EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models	Feb 17, 2025	Automated Essay Scoring, Feature Engineering	Unverified	0
SEM-CLIP: Precise Few-Shot Learning for Nanoscale Defect Detection in Scanning Electron Microscope Image	Feb 15, 2025	Defect Detection, Feature Engineering	Unverified	0
PainDECOG: Machine Learning-Based Identification of Pain Biomarkers from sEEG Signals	Feb 15, 2025	Feature Engineering, Management	Unverified	0
Recent Advances in Malware Detection: Graph Learning and Explainability	Feb 14, 2025	Feature Engineering, Graph Embedding	Unverified	0
Chronic Diseases Prediction Using ML	Feb 13, 2025	Feature Engineering, Prediction	Unverified	0
LLM4GNAS: A Large Language Model Based Toolkit for Graph Neural Architecture Search	Feb 12, 2025	Feature Engineering, Graph Learning	Unverified	0
A Survey on Data-Centric AI: Tabular Learning from Reinforcement Learning and Generative AI Perspective	Feb 12, 2025	Feature Engineering, feature selection	Unverified	0
Decision Tree Based Wrappers for Hearing Loss	Feb 12, 2025	Feature Engineering, feature selection	Unverified	0
Exploring Patterns Behind Sports	Feb 11, 2025	Computational Efficiency, Feature Engineering	Unverified	0
Enhancing Physics-Informed Neural Networks Through Feature Engineering	Feb 11, 2025	Feature Engineering	Unverified	0
Application of quantum machine learning using quantum kernel algorithms on multiclass neuron M type classification	Feb 10, 2025	Binary Classification, Feature Engineering	Unverified	0
Agentic AI Systems Applied to tasks in Financial Services: Modeling and model risk management crews	Feb 8, 2025	Decision Making, Feature Engineering	Unverified	0
Decision Trees That Remember: Gradient-Based Learning of Recurrent Decision Trees with Memory	Feb 6, 2025	Feature Engineering, State Space Models	Unverified	0
From Features to Transformers: Redefining Ranking for Scalable Impact	Feb 5, 2025	Diversity, Feature Engineering	Unverified	0
Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications	Feb 5, 2025	Benchmarking, Feature Engineering	Unverified	0
Year-over-Year Developments in Financial Fraud Detection via Deep Learning: A Systematic Literature Review	Jan 31, 2025	Deep Learning, Feature Engineering	Unverified	0
CAAT-EHR: Cross-Attentional Autoregressive Transformer for Multimodal Electronic Health Record Embeddings	Jan 31, 2025	Feature Engineering	Code Available	0
RAINER: A Robust Ensemble Learning Grid Search-Tuned Framework for Rainfall Patterns Prediction	Jan 28, 2025	Dimensionality Reduction, Ensemble Learning	Unverified	0
360Brew: A Decoder-only Foundation Model for Personalized Ranking and Recommendation	Jan 27, 2025	Decoder, Feature Engineering	Unverified	0
Sample-Efficient Behavior Cloning Using General Domain Knowledge	Jan 27, 2025	Car Racing, Feature Engineering	Unverified	0
A Transferable Physics-Informed Framework for Battery Degradation Diagnosis, Knee-Onset Detection and Knee Prediction	Jan 24, 2025	Feature Engineering, Onset Detection	Unverified	0
EvoGP: A GPU-accelerated Framework for Tree-based Genetic Programming	Jan 21, 2025	Feature Engineering, GPU	Code Available	7
Distributed Multi-Head Learning Systems for Power Consumption Prediction	Jan 21, 2025	Diversity, energy management	Unverified	0
DLinear-based Prediction of Remaining Useful Life of Lithium-Ion Batteries: Feature Engineering through Explainable Artificial Intelligence	Jan 20, 2025	Explainable artificial intelligence, Feature Engineering	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CNN	14 gestures accuracy	0.98	—	Unverified