Feature Engineering

Feature engineering is the process of taking a dataset and constructing explanatory variables, called features, that can be used to train a machine learning model for a prediction problem. Often, data is spread across multiple tables and must be gathered into a single table with one row per observation and one column per feature.
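As a minimal sketch of gathering several tables into one, the example below aggregates a child table into its parent so that each observation ends up as one row of features. The `customers` and `transactions` tables, their column names, and the helper `build_feature_table` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parent table: one row per customer (the observations).
customers = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 51},
    {"customer_id": 3, "age": 27},
]

# Hypothetical child table: zero or more transactions per customer.
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 100.0},
]


def build_feature_table(customers, transactions):
    """Return one row per customer with aggregated transaction features."""
    # Group transaction amounts by the key shared with the parent table.
    amounts = defaultdict(list)
    for t in transactions:
        amounts[t["customer_id"]].append(t["amount"])

    rows = []
    for c in customers:
        spent = amounts.get(c["customer_id"], [])
        rows.append({
            "customer_id": c["customer_id"],
            "age": c["age"],
            # Aggregations turn many child rows into single feature values.
            "n_transactions": len(spent),
            "total_amount": sum(spent),
            "mean_amount": mean(spent) if spent else 0.0,
        })
    return rows


feature_table = build_feature_table(customers, transactions)
```

Customers without transactions still get a row, with counts and totals of zero, so the resulting table keeps exactly one row per observation.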

The traditional approach to feature engineering is to build features one at a time using domain knowledge. This process, known as manual feature engineering, is tedious, time-consuming, and error-prone. The code for manual feature engineering is problem-dependent and must be rewritten for each new dataset.
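To illustrate why manual feature code is problem-dependent, here is a small sketch for an imagined loan dataset. The field names (`debt`, `income`, `applied_on`) and the two features are assumptions made for this example; each one encodes a piece of domain knowledge and is tied to this particular schema.

```python
from datetime import date


def add_manual_features(loan):
    """Return a copy of one loan record with hand-crafted features added."""
    row = dict(loan)
    # Domain knowledge: repayment burden relative to income is informative.
    row["debt_to_income"] = (
        loan["debt"] / loan["income"] if loan["income"] else None
    )
    # Domain knowledge (assumed): year-end applications may behave differently.
    row["applied_in_q4"] = loan["applied_on"].month >= 10
    return row


# Hypothetical record; a different dataset would need different code.
loan = {"debt": 12000.0, "income": 48000.0, "applied_on": date(2023, 11, 5)}
features = add_manual_features(loan)
```

Because the column names and the logic are hard-coded, none of this function carries over to a dataset with a different schema.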

Papers

Showing 26–50 of 1706 papers

Title	Date	Tasks	Status	Hype
The Remarkable Robustness of LLMs: Stages of Inference?	Jun 27, 2024	Feature Engineering, Prediction	Code Available	1
Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning	Jun 12, 2024	Automated Feature Engineering, Feature Engineering	Code Available	1
Network Analytics for Anti-Money Laundering -- A Systematic Literature Review and Experimental Evaluation	May 29, 2024	Feature Engineering, Fraud Detection	Code Available	1
Benchmarking Skeleton-based Motion Encoder Models for Clinical Applications: Estimating Parkinson's Disease Severity in Walking Sequences	May 28, 2024	Benchmarking, Feature Engineering	Code Available	1
VCR-Graphormer: A Mini-batch Graph Transformer via Virtual Connections	Mar 24, 2024	Feature Engineering, Graph Learning	Code Available	1
Retrieve, Merge, Predict: Augmenting Tables with Data Lakes	Feb 9, 2024	AutoML, Benchmarking	Code Available	1
SMUTF: Schema Matching Using Generative Tags and Hybrid Features	Jan 22, 2024	Feature Engineering, Humanitarian	Code Available	1
Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of Multiclass Defect Segmentation	Dec 21, 2023	Edge Detection, Feature Engineering	Code Available	1
Relational Deep Learning: Graph Representation Learning on Relational Databases	Dec 7, 2023	Deep Learning, Feature Engineering	Code Available	1
netFound: Foundation Model for Network Security	Oct 25, 2023	Feature Engineering, feature selection	Code Available	1
Blending gradient boosted trees and neural networks for point and probabilistic forecasting of hierarchical time series	Oct 19, 2023	Diversity, Feature Engineering	Code Available	1
FASER: Binary Code Similarity Search through the use of Intermediate Representations	Oct 5, 2023	Feature Engineering	Code Available	1
Fine-Tuning Self-Supervised Learning Models for End-to-End Pronunciation Scoring	Sep 19, 2023	Feature Engineering, Phone-level pronunciation scoring	Code Available	1
SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning	Aug 3, 2023	Feature Engineering, Graph Learning	Code Available	1
Cognitive Evolutionary Search to Select Feature Interactions for Click-Through Rate Prediction	Aug 1, 2023	Click-Through Rate Prediction, Evolutionary Algorithms	Code Available	1
TimeTuner: Diagnosing Time Representations for Time-Series Forecasting with Counterfactual Explanations	Jul 19, 2023	counterfactual, Feature Engineering	Code Available	1
Benchmarks and Custom Package for Energy Forecasting	Jul 14, 2023	Feature Engineering, Load Forecasting	Code Available	1
Feature Programming for Multivariate Time Series Prediction	Jun 9, 2023	Automated Feature Engineering, Feature Engineering	Code Available	1
An End-to-End Reinforcement Learning Approach for Job-Shop Scheduling Problems Based on Constraint Programming	Jun 9, 2023	Combinatorial Optimization, Feature Engineering	Code Available	1
Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering	May 5, 2023	Automated Feature Engineering, AutoML	Code Available	1
Deep Dive into Hunting for LotLs Using Machine Learning and Feature Engineering.	Apr 21, 2023	Feature Engineering	Code Available	1
SkillGPT: a RESTful API service for skill extraction and standardization using a Large Language Model	Apr 17, 2023	Feature Engineering, Language Modeling	Code Available	1
Bayesian Optimization of Catalysis With In-Context Learning	Apr 11, 2023	Bayesian Optimization, Feature Engineering	Code Available	1
DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection	Apr 1, 2023	Deep Learning, Feature Engineering	Code Available	1
DoE2Vec: Deep-learning Based Features for Exploratory Landscape Analysis	Mar 31, 2023	Deep Learning, Feature Engineering	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CNN	14 gestures accuracy	0.98	—	Unverified