Feature Engineering

Feature engineering is the process of taking a dataset and constructing explanatory variables, called features, that can be used to train a machine learning model for a prediction problem. Often the data is spread across multiple tables and must be gathered into a single table in which each row is an observation and each column is a feature.
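
As a rough illustration of gathering several tables into one feature table, the sketch below aggregates a child table and joins it onto a parent table with pandas. The `customers` and `orders` tables, their columns, and the chosen aggregations are all made up for this example; a real problem would need its own choices.

```python
import pandas as pd

# Hypothetical parent table: one row per customer (the observations).
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "signup_year": [2019, 2020, 2021],
})

# Hypothetical child table: zero or more rows per customer.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [20.0, 30.0, 5.0, 10.0, 15.0],
})

# Aggregate the child table down to one row per customer.
order_features = (
    orders.groupby("customer_id")["amount"]
    .agg(order_count="count", order_total="sum", order_mean="mean")
    .reset_index()
)

# Left join so that customers without any orders still keep their row.
feature_table = customers.merge(order_features, on="customer_id", how="left")

# Count and total have a natural "no orders" value of 0; the mean is left as NaN.
feature_table[["order_count", "order_total"]] = (
    feature_table[["order_count", "order_total"]].fillna(0)
)

print(feature_table)
```

The left join is a deliberate choice here: an inner join would silently drop observations that have no related rows, which changes the training set.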

The traditional approach is to build features one at a time using domain knowledge, a tedious, time-consuming, and error-prone process known as manual feature engineering. The code for manual feature engineering is problem-dependent and must be rewritten for each new dataset.
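
To make the problem-dependence concrete, here is a minimal sketch of hand-written features for a hypothetical loan-application record. The field names (`annual_income`, `total_debt`, `account_opened`, `applied_on`) and the three features are assumptions chosen for illustration, not taken from any particular dataset.

```python
from datetime import date


def build_loan_features(record: dict) -> dict:
    """Build hand-crafted features for one hypothetical loan application."""
    income = record["annual_income"]
    opened = record["account_opened"]
    applied = record["applied_on"]
    return {
        # Ratio feature from domain knowledge; undefined when income is not positive.
        "debt_to_income": record["total_debt"] / income if income > 0 else None,
        # Elapsed time between two dates, in days.
        "account_age_days": (applied - opened).days,
        # Calendar feature: Monday is 0, so Saturday and Sunday are 5 and 6.
        "applied_on_weekend": applied.weekday() >= 5,
    }


example = {
    "annual_income": 50_000.0,
    "total_debt": 10_000.0,
    "account_opened": date(2020, 1, 1),
    "applied_on": date(2021, 1, 1),
}
features = build_loan_features(example)
print(features)
```

Every line of `build_loan_features` refers to a column specific to this imagined dataset, which is why code of this kind rarely carries over to a new problem.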

Papers

Title	Date	Tasks	Status	Hype	Score
Discovering Neural Wirings	Jun 3, 2019	Feature Engineering, Network Pruning	Code Available	1	5
BP-Net: Efficient Deep Learning for Continuous Arterial Blood Pressure Estimation using Photoplethysmogram	Nov 29, 2021	Blood pressure estimation, Feature Engineering	Code Available	1	5
Can Models Help Us Create Better Models? Evaluating LLMs as Data Scientists	Oct 30, 2024	Feature Engineering	Code Available	1	5
Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver?	Dec 1, 2020	Feature Engineering, Q-Learning	Code Available	1	5
CASPR: Customer Activity Sequence-based Prediction and Representation	Nov 16, 2022	Feature Engineering, Prediction	Code Available	1	5
Cardea: An Open Automated Machine Learning Framework for Electronic Health Records	Oct 1, 2020	Automated Feature Engineering, AutoML	Code Available	1	5
Deep Dive into Hunting for LotLs Using Machine Learning and Feature Engineering.	Apr 21, 2023	Feature Engineering	Code Available	1	5
Relational Deep Learning: Graph Representation Learning on Relational Databases	Dec 7, 2023	Deep Learning, Feature Engineering	Code Available	1	5
Replay and Synthetic Speech Detection with Res2net Architecture	Oct 28, 2020	Feature Engineering, Synthetic Speech Detection	Code Available	1	5
CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT	Apr 20, 2020	Feature Engineering	Code Available	1	5
Clinical Temporal Relation Extraction with Probabilistic Soft Logic Regularization and Global Inference	Dec 16, 2020	Feature Engineering, Medical Question Answering	Code Available	1	5
Compatible deep neural network framework with financial time series data, including data preprocessor, neural network model and trading strategy	May 11, 2022	Binary Classification, Feature Engineering	Code Available	1	5
CodeCMR: Cross-Modal Retrieval For Function-Level Binary Source Code Matching	Dec 1, 2020	Computer Security, Cross-Modal Retrieval	Code Available	1	5
Cognitive Evolutionary Search to Select Feature Interactions for Click-Through Rate Prediction	Aug 1, 2023	Click-Through Rate Prediction, Evolutionary Algorithms	Code Available	1	5
Simplified DOM Trees for Transferable Attribute Extraction from the Web	Jan 7, 2021	Attribute, Attribute Extraction	Code Available	1	5
SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning	Aug 3, 2023	Feature Engineering, Graph Learning	Code Available	1	5
Context-Aware Deep Learning for Multi Modal Depression Detection	Dec 26, 2024	Data Augmentation, Deep Learning	Code Available	1	5
SMUTF: Schema Matching Using Generative Tags and Hybrid Features	Jan 22, 2024	Feature Engineering, Humanitarian	Code Available	1	5
Supervised Learning on Relational Databases with Graph Neural Networks	Feb 6, 2020	BIG-bench Machine Learning, Feature Engineering	Code Available	1	5
Symbolic regression for scientific discovery: an application to wind speed forecasting	Feb 21, 2021	Feature Engineering, regression	Code Available	1	5
DeepSurv: Personalized Treatment Recommender System Using A Cox Proportional Hazards Deep Neural Network	Jun 2, 2016	Feature Engineering, Predicting Patient Outcomes	Code Available	1	5
Synerise at RecSys 2021: Twitter user engagement prediction with a fast neural model	Sep 23, 2021	CPU, Feature Engineering	Code Available	1	5
The Remarkable Robustness of LLMs: Stages of Inference?	Jun 27, 2024	Feature Engineering, Prediction	Code Available	1	5
DiviK: Divisive intelligent K-Means for hands-free unsupervised clustering in big biological data	Sep 22, 2020	Clustering, Feature Engineering	Code Available	1	5
Modelling Context with User Embeddings for Sarcasm Detection in Social Media	Jul 4, 2016	Feature Engineering, Sarcasm Detection	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CNN	14 gestures accuracy	0.98	—	Unverified