Feature Engineering

Feature engineering is the process of constructing explanatory variables (features) from a dataset so that they can be used to train a machine learning model for a prediction problem. Data is often spread across multiple tables and must be gathered into a single table, with one row per observation and one column per feature.
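
As a minimal sketch of the gathering step, the Python snippet below collapses a one-to-many child table into per-observation aggregates and attaches them to the parent table. The `customers` and `transactions` tables, their column names, and the chosen aggregates (count, total, mean) are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parent table: one row per observation (a customer).
customers = [
    {"customer_id": 1, "country": "DE"},
    {"customer_id": 2, "country": "FR"},
    {"customer_id": 3, "country": "DE"},
]

# Hypothetical child table: zero or more rows per customer.
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 30.0},
    {"customer_id": 2, "amount": 5.0},
]


def build_feature_table(parents, children):
    """Return one row per parent, extended with aggregates of its children."""
    # Group child amounts by the key they share with the parent table.
    amounts = defaultdict(list)
    for row in children:
        amounts[row["customer_id"]].append(row["amount"])

    table = []
    for parent in parents:
        values = amounts.get(parent["customer_id"], [])
        table.append({
            **parent,
            "txn_count": len(values),
            "txn_total": sum(values),
            # With no transactions, leave the mean missing rather than guess 0.
            "txn_mean": mean(values) if values else None,
        })
    return table


feature_table = build_feature_table(customers, transactions)
for row in feature_table:
    print(row)
```

Each row of `feature_table` is one observation and each key is one feature. A customer with no transactions still gets a row, so the join does not silently drop observations.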

The traditional approach, known as manual feature engineering, builds features one at a time using domain knowledge. It is tedious, time-consuming, and error-prone. Because the code is problem-dependent, it must be rewritten for each new dataset.
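
To illustrate why such code is problem-dependent, here is a small sketch of hand-built features for an imagined loan-application record. The record layout, the field names, and the three features are assumptions chosen for the example, and none of them would carry over unchanged to a different dataset.

```python
from datetime import date

# Hypothetical raw record for a single loan application.
loan = {
    "income": 4000.0,
    "monthly_debt": 1000.0,
    "applied_on": date(2024, 3, 9),
}


def manual_features(record):
    """Build features one at a time, each encoding a piece of domain knowledge."""
    features = {}

    # Domain knowledge: debt relative to income matters more than raw debt.
    income = record["income"]
    features["debt_to_income"] = (
        record["monthly_debt"] / income if income > 0 else None
    )

    # Domain knowledge: weekend applications may behave differently.
    # date.weekday() returns 5 for Saturday and 6 for Sunday.
    features["applied_on_weekend"] = record["applied_on"].weekday() >= 5

    # Domain knowledge: seasonality, captured here as the calendar month.
    features["application_month"] = record["applied_on"].month

    return features


features = manual_features(loan)
print(features)
```

Every line of `manual_features` refers to a field that exists only in this imagined schema, which is why the function would have to be rewritten for a new problem.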

Papers

Showing 501–550 of 1706 papers

Title	Date	Tasks	Status
Deep Interaction Machine: A Simple but Effective Model for High-order Feature Interactions	Jan 1, 2020	Click-Through Rate Prediction, Feature Engineering	Unverified
ADSAGE: Anomaly Detection in Sequences of Attributed Graph Edges applied to insider threat detection at fine-grained level	Jul 14, 2020	Anomaly Detection, Feature Engineering	Unverified
A Brief Survey of Machine Learning Methods for Emotion Prediction using Physiological Data	Jan 17, 2022	BIG-bench Machine Learning, Feature Engineering	Unverified
Decoding and interpreting cortical signals with a compact convolutional neural network	Mar 2, 2021	Brain Decoding, EEG	Unverified
A Survey on Data Collection for Machine Learning: a Big Data -- AI Integration Perspective	Nov 8, 2018	BIG-bench Machine Learning, Feature Engineering	Unverified
Deep Learning based, end-to-end metaphor detection in Greek language with Recurrent and Convolutional Neural Networks	Jul 23, 2020	Feature Engineering, Representation Learning	Unverified
Decision Trees That Remember: Gradient-Based Learning of Recurrent Decision Trees with Memory	Feb 6, 2025	Feature Engineering, State Space Models	Unverified
Decision Tree Based Wrappers for Hearing Loss	Feb 12, 2025	Feature Engineering, feature selection	Unverified
A Survey on Data-Centric AI: Tabular Learning from Reinforcement Learning and Generative AI Perspective	Feb 12, 2025	Feature Engineering, feature selection	Unverified
Amrita_CEN at SemEval-2022 Task 6: A Machine Learning Approach for Detecting Intended Sarcasm using Oversampling	Jul 1, 2022	Feature Engineering, regression	Unverified
Deceptive Review Spam Detection via Exploiting Task Relatedness and Unlabeled Data	Nov 1, 2016	Feature Engineering, General Classification	Unverified
A Survey on Churn Analysis	Oct 25, 2020	Feature Engineering, Management	Unverified
DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison	Aug 1, 2017	Feature Engineering, Humor Detection	Unverified
A Survey on Arabic Named Entity Recognition: Past, Recent Advances, and Future Trends	Feb 7, 2023	Feature Engineering, Language Modeling	Unverified
Amrita_CEN at SemEval-2022 Task 4: Oversampling-based Machine Learning Approach for Detecting Patronizing and Condescending Language	Jul 1, 2022	Feature Engineering, regression	Unverified
Data Smashing 2.0: Sequence Likelihood (SL) Divergence For Fast Time Series Comparison	Sep 26, 2019	Feature Engineering, Time Series	Unverified
Dataset-Agnostic Recommender Systems	Jan 13, 2025	Feature Engineering, feature selection	Unverified
Dataiku's Solution to SPHERE's Activity Recognition Challenge	Oct 10, 2016	Activity Recognition, BIG-bench Machine Learning	Unverified
Data-driven Smart Ponzi Scheme Detection	Aug 20, 2021	Dynamic graph embedding, Feature Engineering	Unverified
Data-Driven Investigative Journalism For Connectas Dataset	Apr 23, 2018	BIG-bench Machine Learning, Feature Engineering	Unverified
Enhancing Sindhi Word Segmentation using Subword Representation Learning and Position-aware Self-attention	Dec 30, 2020	Feature Engineering, Position	Unverified
Data-driven intelligent computational design for products: Method, techniques, and applications	Jan 29, 2023	Feature Engineering, Retrieval	Unverified
Data Collection and Quality Challenges in Deep Learning: A Data-Centric AI Perspective	Dec 13, 2021	BIG-bench Machine Learning, Fairness	Unverified
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code	Mar 8, 2023	Feature Engineering	Unverified
A Model of Coherence Based on Distributed Sentence Representation	Oct 1, 2014	Feature Engineering, Sentence	Unverified
A Defensive Framework Against Adversarial Attacks on Machine Learning-Based Network Intrusion Detection Systems	Feb 21, 2025	Ensemble Learning, Feature Engineering	Unverified
DAG-based Long Short-Term Memory for Neural Word Segmentation	Jul 2, 2017	Chinese Word Segmentation, Feature Engineering	Unverified
Customer Support Ticket Escalation Prediction using Feature Engineering	Oct 10, 2020	Feature Engineering, Management	Unverified
A strong baseline for question relevancy ranking	Aug 27, 2018	Community Question Answering, Feature Engineering	Unverified
AMEIR: Automatic Behavior Modeling, Interaction Exploration and MLP Investigation in the Recommender System	Jun 10, 2020	Feature Engineering, Neural Architecture Search	Unverified
Customers Churn Prediction in Financial Institution Using Artificial Neural Network	Dec 23, 2019	Feature Engineering, feature selection	Unverified
Customer Lifetime Value in Video Games Using Deep Learning and Parametric Models	Nov 28, 2018	Feature Engineering	Unverified
A streamable large-scale clinical EEG dataset for Deep Learning	Mar 4, 2022	Deep Learning, EEG	Unverified
Cuffless Blood Pressure Estimation from Electrocardiogram and Photoplethysmogram Using Waveform Based ANN-LSTM Network	Nov 6, 2018	Blood pressure estimation, Feature Engineering	Unverified
CTSys at SemEval-2018 Task 3: Irony in Tweets	Jun 1, 2018	Feature Engineering, General Classification	Unverified
A State-of-the-Art Mention-Pair Model for Coreference Resolution	Jun 1, 2015	coreference-resolution, Coreference Resolution	Unverified
AMC-Net: An Effective Network for Automatic Modulation Classification	Apr 2, 2023	Classification, Denoising	Unverified
A Deep Representation Empowered Distant Supervision Paradigm for Clinical Information Extraction	Apr 20, 2018	BIG-bench Machine Learning, Feature Engineering	Unverified
Democratizing AI: Non-expert design of prediction tasks	Feb 14, 2018	Feature Engineering, Prediction	Unverified
A Stacking Gated Neural Architecture for Implicit Discourse Relation Classification	Nov 1, 2016	Feature Engineering, General Classification	Unverified
Cross-lingual Short-text Matching with Deep Learning	Nov 13, 2018	Deep Learning, Feature Engineering	Unverified
AssistedDS: Benchmarking How External Domain Knowledge Assists LLMs in Automated Data Science	May 25, 2025	Benchmarking, Feature Engineering	Unverified
A machine learning model for identifying cyclic alternating patterns in the sleeping brain	Apr 23, 2018	BIG-bench Machine Learning, EEG	Unverified
Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation	Jul 21, 2017	Clustering, Cross-Lingual Transfer	Unverified
Cross-Class Relevance Learning for Temporal Concept Localization	Nov 19, 2019	Feature Engineering, Video Understanding	Unverified
Assets Forecasting with Feature Engineering and Transformation Methods for LightGBM	Dec 27, 2024	Feature Engineering, Feature Importance	Unverified
Credit card fraud detection using machine learning: A survey	Oct 13, 2020	BIG-bench Machine Learning, Feature Engineering	Unverified
Coupled IGMM-GANs for deep multimodal anomaly detection in human mobility data	Sep 8, 2018	Anomaly Detection, Feature Engineering	Unverified
Asset price movement prediction using empirical mode decomposition and Gaussian mixture models	Mar 26, 2025	Ensemble Learning, Feature Engineering	Unverified
A Machine Learning Approach to Digital Contact Tracing: TC4TL Challenge	Mar 8, 2022	BIG-bench Machine Learning, Feature Engineering	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CNN	14 gestures accuracy	0.98	—	Unverified