Feature Engineering

Feature engineering is the process of taking a dataset and constructing explanatory variables (features) that can be used to train a machine learning model for a prediction problem. Often, data is spread across multiple tables and must be gathered into a single table in which each row is an observation and each column is a feature.

The traditional approach to feature engineering is to build features one at a time using domain knowledge, a tedious, time-consuming, and error-prone process known as manual feature engineering. The code for manual feature engineering is problem-dependent and must be rewritten for each new dataset.
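
As a rough illustration of the manual approach described above, the sketch below gathers two related tables into a single feature table with pandas. The tables, the column names (`customer_id`, `amount`, `signup_year`), and the helper `build_feature_table` are all hypothetical and chosen only for this example. Each aggregation is written by hand, which is why such code has to be redone for every new dataset.

```python
import pandas as pd

# Hypothetical parent table: one row per customer (the observations).
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "signup_year": [2019, 2021, 2020],
})

# Hypothetical child table: many rows per customer.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [20.0, 35.0, 10.0, 15.0, 5.0],
})


def build_feature_table(customers: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
    """Hand-build per-customer features and join them onto the parent table."""
    # Each feature here is a manual, problem-specific choice.
    per_customer = (
        orders.groupby("customer_id")["amount"]
        .agg(order_count="count", total_spent="sum", mean_order="mean")
        .reset_index()
    )
    # A left join keeps customers that have no orders at all.
    features = customers.merge(per_customer, on="customer_id", how="left")
    # Customers without orders get 0 for counts and totals;
    # mean_order is left as NaN because it is undefined for them.
    count_cols = ["order_count", "total_spent"]
    features[count_cols] = features[count_cols].fillna(0)
    features["order_count"] = features["order_count"].astype(int)
    return features


features = build_feature_table(customers, orders)
print(features)
```

The result has one row per customer and one column per feature, which is the single-table shape a model expects. Adding a third table or a different aggregation would mean writing more code of the same kind by hand.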

Papers

Showing 51–75 of 1706 papers

Title	Date	Tasks	Status	Hype
Feature Engineering on LMS Data to Optimize Student Performance Prediction	Apr 3, 2025	Feature Engineering, Management	Unverified	0
Unleashing the Power of Pre-trained Encoders for Universal Adversarial Attack Detection	Apr 1, 2025	Adversarial Attack, Adversarial Attack Detection	Unverified	0
SPIO: Ensemble and Selective Strategies via LLM-Based Multi-Agent Planning in Automated Data Science	Mar 30, 2025	Feature Engineering	Unverified	0
FeRG-LLM: Feature Engineering by Reason Generation Large Language Models	Mar 30, 2025	Feature Engineering, Large Language Model	Unverified	0
RocketPPA: Code-Level Power, Performance, and Area Prediction via LLM and Mixture of Experts	Mar 27, 2025	Code Repair, Feature Engineering	Unverified	0
Embedding Domain-Specific Knowledge from LLMs into the Feature Engineering Pipeline	Mar 27, 2025	Feature Engineering, feature selection	Unverified	0
Feature-Enhanced Machine Learning for All-Cause Mortality Prediction in Healthcare Data	Mar 27, 2025	All, Feature Engineering	Unverified	0
Asset price movement prediction using empirical mode decomposition and Gaussian mixture models	Mar 26, 2025	Ensemble Learning, Feature Engineering	Unverified	0
Machine Learning - Driven Materials Discovery: Unlocking Next-Generation Functional Materials -- A minireview	Mar 22, 2025	AutoML, Bayesian Optimization	Unverified	0
NeuralFoil: An Airfoil Aerodynamics Analysis Tool Using Physics-Informed Machine Learning	Mar 20, 2025	Feature Engineering, Physics-informed machine learning	Code Available	3
LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers	Mar 18, 2025	Automated Feature Engineering, Feature Engineering	Code Available	2
CoDet-M4: Detecting Machine-Generated Code in Multi-Lingual, Multi-Generator and Multi-Domain Settings	Mar 17, 2025	Code Generation, Ethics	Unverified	0
Applications of Large Language Model Reasoning in Feature Generation	Mar 15, 2025	Computational Efficiency, Domain Adaptation	Unverified	0
VORTEX: Challenging CNNs at Texture Recognition by using Vision Transformers with Orderless and Randomized Token Encodings	Mar 9, 2025	Computational Efficiency, Feature Engineering	Code Available	0
Exploring LLM Agents for Cleaning Tabular Machine Learning Datasets	Mar 9, 2025	Data Integration, Feature Engineering	Unverified	0
Bridging the Semantic Gap in Virtual Machine Introspection and Forensic Memory Analysis	Mar 7, 2025	Feature Engineering	Unverified	0
YARE-GAN: Yet Another Resting State EEG-GAN	Mar 4, 2025	EEG, Feature Engineering	Code Available	0
Efficient or Powerful? Trade-offs Between Machine Learning and Deep Learning for Mental Illness Detection on Social Media	Mar 3, 2025	Computational Efficiency, Feature Engineering	Unverified	0
Integrating convolutional layers and biformer network with forward-forward and backpropagation training	Feb 28, 2025	Computational chemistry, Drug Discovery	Code Available	0
Improving Representation Learning of Complex Critical Care Data with ICU-BERT	Feb 26, 2025	Feature Engineering, Language Modeling	Unverified	0
Edge Training and Inference with Analog ReRAM Technology for Hand Gesture Recognition	Feb 25, 2025	Feature Engineering, Gesture Recognition	Unverified	0
Mitigating Attrition: Data-Driven Approach Using Machine Learning and Data Engineering	Feb 25, 2025	Feature Engineering	Unverified	0
TabulaTime: A Novel Multimodal Deep Learning Framework for Advancing Acute Coronary Syndrome Prediction through Environmental and Clinical Data Integration	Feb 24, 2025	Data Integration, Feature Engineering	Unverified	0
ML-Driven Approaches to Combat Medicare Fraud: Advances in Class Imbalance Solutions, Feature Engineering, Adaptive Learning, and Business Impact	Feb 21, 2025	Diagnostic, Dimensionality Reduction	Unverified	0
A Defensive Framework Against Adversarial Attacks on Machine Learning-Based Network Intrusion Detection Systems	Feb 21, 2025	Ensemble Learning, Feature Engineering	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CNN	14 gestures accuracy	0.98	—	Unverified