Feature Engineering

Feature engineering is the process of taking a dataset and constructing explanatory variables, called features, that can be used to train a machine learning model for a prediction problem. Often, data is spread across multiple tables and must be gathered into a single table whose rows are the observations and whose columns are the features.
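As a rough illustration of gathering several tables into one feature table, the sketch below aggregates a child table and joins it onto a parent table with pandas. The tables and column names (customers, orders, customer_id, amount) are hypothetical and chosen only for the example; they do not come from any of the papers listed here.

```python
import pandas as pd

# Hypothetical parent table: one row per customer (the observations).
customers = pd.DataFrame({"customer_id": [1, 2, 3], "age": [34, 51, 27]})

# Hypothetical child table: zero or more rows per customer.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [20.0, 35.0, 5.0, 12.5, 40.0],
})

# Aggregate the child table down to one row per customer.
order_features = (
    orders.groupby("customer_id")["amount"]
    .agg(order_count="count", total_amount="sum", mean_amount="mean")
    .reset_index()
)

# Left-join so every observation is kept, even customers with no orders.
feature_table = customers.merge(order_features, on="customer_id", how="left")

# For customers without orders, count and total become 0; the mean stays missing.
count_cols = ["order_count", "total_amount"]
feature_table[count_cols] = feature_table[count_cols].fillna(0)
```

The left join is a design choice: it keeps observations that have no related rows, which an inner join would silently drop.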

The traditional approach to feature engineering is to build features one at a time using domain knowledge, a tedious, time-consuming, and error-prone process known as manual feature engineering. The code for manual feature engineering is problem-dependent and must be re-written for each new dataset.
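A minimal sketch of what manual feature engineering can look like in practice, assuming a hypothetical loan dataset with loan_amount, annual_income, and application_date columns. Each feature is written by hand from domain knowledge, and the function is tied to these specific column names, which is why such code usually has to be rewritten for a new dataset.

```python
import pandas as pd

def add_loan_features(loans: pd.DataFrame) -> pd.DataFrame:
    """Add hand-crafted features to a hypothetical loan table."""
    out = loans.copy()
    # Domain knowledge: loan size relative to income matters more than either alone.
    out["debt_to_income"] = out["loan_amount"] / out["annual_income"]
    # Calendar features derived from the application timestamp.
    out["application_month"] = out["application_date"].dt.month
    # dayofweek is 0 for Monday through 6 for Sunday, so 5 and 6 are the weekend.
    out["is_weekend_application"] = out["application_date"].dt.dayofweek >= 5
    return out

loans = pd.DataFrame({
    "loan_amount": [10000.0, 2500.0],
    "annual_income": [50000.0, 25000.0],
    "application_date": pd.to_datetime(["2024-03-02", "2024-11-04"]),
})
loan_features = add_loan_features(loans)
```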

Papers

Showing 151–175 of 1706 papers

Title	Date	Tasks	Status	Hype
See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers	Nov 4, 2024	Anomaly Detection, Feature Engineering	Unverified	0
Explainable cognitive decline detection in free dialogues with a Machine Learning approach based on pre-trained Large Language Models	Nov 4, 2024	Feature Engineering, Prompt Engineering	Unverified	0
Enhancing Glucose Level Prediction of ICU Patients through Hierarchical Modeling of Irregular Time-Series	Nov 3, 2024	Data Integration, Feature Engineering	Code Available	0
Enriching Tabular Data with Contextual LLM Embeddings: A Comprehensive Ablation Study for Ensemble Classifiers	Nov 3, 2024	Ensemble Learning, Feature Engineering	Unverified	0
Machine Learning Framework for Audio-Based Content Evaluation using MFCC, Chroma, Spectral Contrast, and Temporal Feature Engineering	Oct 31, 2024	Feature Engineering	Unverified	0
Can Models Help Us Create Better Models? Evaluating LLMs as Data Scientists	Oct 30, 2024	Feature Engineering	Code Available	1
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions	Oct 27, 2024	Feature Engineering	Code Available	3
Predicting 30-Day Hospital Readmission in Medicare Patients: Insights from an LSTM Deep Learning Model	Oct 23, 2024	Feature Engineering, Readmission Prediction	Unverified	0
Large Language Models Engineer Too Many Simple Features For Tabular Data	Oct 23, 2024	Feature Engineering, Text Generation	Code Available	0
AdaptoML-UX: An Adaptive User-centered GUI-based AutoML Toolkit for Non-AI Experts and HCI Researchers	Oct 22, 2024	Automated Feature Engineering, AutoML	Code Available	0
Molecular Topological Profile (MOLTOP) - Simple and Strong Baseline for Molecular Graph Classification	Oct 17, 2024	Feature Engineering, Graph Classification	Code Available	0
Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature	Oct 14, 2024	Feature Engineering, Voice pathology detection	Code Available	0
Statistical Test for Auto Feature Engineering by Selective Inference	Oct 13, 2024	Feature Engineering	Code Available	0
ELF-Gym: Evaluating Large Language Models Generated Features for Tabular Prediction	Oct 13, 2024	Feature Engineering	Code Available	0
Sui Generis: Large Language Models for Authorship Attribution and Verification in Latin	Oct 11, 2024	Authorship Attribution, Authorship Verification	Unverified	0
Towards Trustworthy Web Attack Detection: An Uncertainty-Aware Ensemble Deep Kernel Learning Model	Oct 10, 2024	Ensemble Learning, Feature Engineering	Unverified	0
Principal Orthogonal Latent Components Analysis (POLCA Net)	Oct 9, 2024	Dimensionality Reduction, Feature Correlation	Code Available	0
Neural-Bayesian Program Learning for Few-shot Dialogue Intent Parsing	Oct 8, 2024	Feature Engineering, Few-Shot Learning	Unverified	0
Learning to Solve Abstract Reasoning Problems with Neurosymbolic Program Synthesis and Task Generation	Oct 6, 2024	Feature Engineering, Program Synthesis	Unverified	0
Self-eXplainable AI for Medical Image Analysis: A Survey and New Outlooks	Oct 3, 2024	counterfactual, Counterfactual Explanation	Unverified	0
Semantic-Guided RL for Interpretable Feature Engineering	Oct 3, 2024	Automated Feature Engineering, Deep Reinforcement Learning	Unverified	0
Enhancing End Stage Renal Disease Outcome Prediction: A Multi-Sourced Data-Driven Approach	Oct 2, 2024	Data Integration, Feature Engineering	Unverified	0
Automatic deductive coding in discourse analysis: an application of large language models in learning analytics	Oct 2, 2024	Feature Engineering, Language Modeling	Code Available	0
LML-DAP: Language Model Learning a Dataset for Data-Augmented Prediction	Sep 27, 2024	Classification, Feature Engineering	Code Available	1
Enhanced Convolution Neural Network with Optimized Pooling and Hyperparameter Tuning for Network Intrusion Detection	Sep 27, 2024	Attention Score Prediction, Feature Engineering	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CNN	14 gestures accuracy	0.98	—	Unverified