Feature Engineering

Feature engineering is the process of taking a dataset and constructing explanatory variables, called features, that can be used to train a machine learning model for a prediction problem. Often, data is spread across multiple tables and must be gathered into a single table with one row per observation and one column per feature.

The traditional approach, known as manual feature engineering, is to build features one at a time using domain knowledge, which is tedious, time-consuming, and error-prone. The code for manual feature engineering is problem-dependent and must be rewritten for each new dataset.
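
As a rough illustration of the manual approach, the sketch below gathers two small tables into a single feature table with one row per observation. The `customers` and `transactions` tables, their column names, and the `build_feature_table` helper are all hypothetical, chosen only for this example; the aggregate features (count, sum, mean) are typical hand-built choices, not a prescribed set.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: one row per customer, many rows per customer in transactions.
customers = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 51},
]
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 2, "amount": 15.0},
]


def build_feature_table(customers, transactions):
    """Return one row per customer with hand-built aggregate features."""
    # Group transaction amounts by the customer they belong to.
    amounts = defaultdict(list)
    for t in transactions:
        amounts[t["customer_id"]].append(t["amount"])

    rows = []
    for c in customers:
        spent = amounts.get(c["customer_id"], [])
        rows.append({
            "customer_id": c["customer_id"],
            "age": c["age"],
            # Each feature below is written by hand, one at a time.
            "n_transactions": len(spent),
            "total_amount": sum(spent),
            "mean_amount": mean(spent) if spent else 0.0,
        })
    return rows


feature_table = build_feature_table(customers, transactions)
for row in feature_table:
    print(row)
```

Every new feature means another hand-written line, and the grouping key and column names are tied to this particular schema, which is why such code rarely transfers to a new dataset unchanged.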

Papers

Showing 251–300 of 1706 papers

Title	Date	Tasks	Status
F-RBA: A Federated Learning-based Framework for Risk-based Authentication	Dec 16, 2024	Anomaly Detection, Feature Engineering	Unverified
Feature engineering vs. deep learning for paper section identification: Toward applications in Chinese medical literature	Dec 15, 2024	Deep Learning, Feature Engineering	Unverified
A Progressive Transformer for Unifying Binary Code Embedding and Knowledge Transfer	Dec 15, 2024	Feature Engineering, Language Modeling	Unverified
Deep Learning-Based Noninvasive Screening of Type 2 Diabetes with Chest X-ray Images and Electronic Health Records	Dec 14, 2024	Diagnostic, Feature Engineering	Code Available
Modeling Story Expectations to Understand Engagement: A Generative Framework Using LLMs	Dec 13, 2024	Feature Engineering, Marketing	Unverified
Vision Transformers for Efficient Indoor Pathloss Radio Map Prediction	Dec 12, 2024	Data Augmentation, Feature Engineering	Unverified
Image-Based Malware Classification Using QR and Aztec Codes	Dec 11, 2024	Feature Engineering, Malware Classification	Unverified
Robust Feature Engineering Techniques for Designing Efficient Motor Imagery-Based BCI-Systems	Dec 10, 2024	Brain Computer Interface, EEG	Unverified
RUL forecasting for wind turbine predictive maintenance based on deep learning	Dec 9, 2024	Feature Engineering, Scheduling	Unverified
Parkinson's Disease Diagnosis Through Deep Learning: A Novel LSTM-Based Approach for Freezing of Gait Detection	Dec 9, 2024	Feature Engineering, L2 Regularization	Unverified
PRECISE: Pre-training Sequential Recommenders with Collaborative and Semantic Information	Dec 9, 2024	Feature Engineering, Recommendation Systems	Unverified
Federated Automated Feature Engineering	Dec 5, 2024	Automated Feature Engineering, Feature Engineering	Unverified
Deep Learning in Single-Cell and Spatial Transcriptomics Data Analysis: Advances and Challenges from a Data Science Perspective	Dec 4, 2024	Feature Engineering	Unverified
Comparative Performance of Machine Learning Algorithms for Early Genetic Disorder and Subclass Classification	Dec 3, 2024	Feature Engineering	Unverified
Intelligent Spark Agents: A Modular LangGraph Framework for Scalable, Visualized, and Enhanced Big Data Machine Learning Workflows	Dec 2, 2024	Decision Making, Distributed Computing	Unverified
HiCat: A Semi-Supervised Approach for Cell Type Annotation	Nov 25, 2024	Dimensionality Reduction, Feature Engineering	Unverified
An AutoML-based approach for Network Intrusion Detection	Nov 24, 2024	AutoML, Feature Engineering	Unverified
Enhancing Molecular Design through Graph-based Topological Reinforcement Learning	Nov 22, 2024	Drug Design, Drug Discovery	Unverified
Understanding LLM Embeddings for Regression	Nov 22, 2024	Feature Engineering, regression	Unverified
Advancing Heatwave Forecasting via Distribution Informed-Graph Neural Networks (DI-GNNs): Integrating Extreme Value Theory with GNNs	Nov 20, 2024	Feature Engineering, Graph Neural Network	Unverified
Is Precise Recovery Necessary? A Task-Oriented Imputation Approach for Time Series Forecasting on Variable Subset	Nov 15, 2024	Feature Engineering, Imputation	Unverified
What makes a good BIM design: quantitative linking between design behavior and quality	Nov 14, 2024	Feature Engineering	Unverified
GPTree: Towards Explainable Decision-Making via LLM-powered Decision Trees	Nov 13, 2024	Decision Making, Feature Engineering	Unverified
Large Language Models for Constructing and Optimizing Machine Learning Workflows: A Survey	Nov 11, 2024	AutoML, Feature Engineering	Code Available
Classification of residential and non-residential buildings based on satellite data using deep learning	Nov 11, 2024	Classification, Computational Efficiency	Unverified
RAGulator: Lightweight Out-of-Context Detectors for Grounded Text Generation	Nov 6, 2024	Feature Engineering, RAG	Unverified
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level	Nov 5, 2024	Bayesian Optimisation, Benchmarking	Unverified
Correlation of Object Detection Performance with Visual Saliency and Depth Estimation	Nov 5, 2024	Depth Estimation, Depth Prediction	Code Available
Explainable cognitive decline detection in free dialogues with a Machine Learning approach based on pre-trained Large Language Models	Nov 4, 2024	Feature Engineering, Prompt Engineering	Unverified
Exploring Feature Importance and Explainability Towards Enhanced ML-Based DoS Detection in AI Systems	Nov 4, 2024	Feature Engineering, Feature Importance	Unverified
See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers	Nov 4, 2024	Anomaly Detection, Feature Engineering	Unverified
Enriching Tabular Data with Contextual LLM Embeddings: A Comprehensive Ablation Study for Ensemble Classifiers	Nov 3, 2024	Ensemble Learning, Feature Engineering	Unverified
Enhancing Glucose Level Prediction of ICU Patients through Hierarchical Modeling of Irregular Time-Series	Nov 3, 2024	Data Integration, Feature Engineering	Code Available
Machine Learning Framework for Audio-Based Content Evaluation using MFCC, Chroma, Spectral Contrast, and Temporal Feature Engineering	Oct 31, 2024	Feature Engineering	Unverified
Large Language Models Engineer Too Many Simple Features For Tabular Data	Oct 23, 2024	Feature Engineering, Text Generation	Code Available
Predicting 30-Day Hospital Readmission in Medicare Patients: Insights from an LSTM Deep Learning Model	Oct 23, 2024	Feature Engineering, Readmission Prediction	Unverified
AdaptoML-UX: An Adaptive User-centered GUI-based AutoML Toolkit for Non-AI Experts and HCI Researchers	Oct 22, 2024	Automated Feature Engineering, AutoML	Code Available
Molecular Topological Profile (MOLTOP) - Simple and Strong Baseline for Molecular Graph Classification	Oct 17, 2024	Feature Engineering, Graph Classification	Code Available
Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature	Oct 14, 2024	Feature Engineering, Voice pathology detection	Code Available
ELF-Gym: Evaluating Large Language Models Generated Features for Tabular Prediction	Oct 13, 2024	Feature Engineering	Code Available
Statistical Test for Auto Feature Engineering by Selective Inference	Oct 13, 2024	Feature Engineering	Code Available
Sui Generis: Large Language Models for Authorship Attribution and Verification in Latin	Oct 11, 2024	Authorship Attribution, Authorship Verification	Unverified
Towards Trustworthy Web Attack Detection: An Uncertainty-Aware Ensemble Deep Kernel Learning Model	Oct 10, 2024	Ensemble Learning, Feature Engineering	Unverified
Principal Orthogonal Latent Components Analysis (POLCA Net)	Oct 9, 2024	Dimensionality Reduction, Feature Correlation	Code Available
Neural-Bayesian Program Learning for Few-shot Dialogue Intent Parsing	Oct 8, 2024	Feature Engineering, Few-Shot Learning	Unverified
Learning to Solve Abstract Reasoning Problems with Neurosymbolic Program Synthesis and Task Generation	Oct 6, 2024	Feature Engineering, Program Synthesis	Unverified
Self-eXplainable AI for Medical Image Analysis: A Survey and New Outlooks	Oct 3, 2024	counterfactual, Counterfactual Explanation	Unverified
Semantic-Guided RL for Interpretable Feature Engineering	Oct 3, 2024	Automated Feature Engineering, Deep Reinforcement Learning	Unverified
Enhancing End Stage Renal Disease Outcome Prediction: A Multi-Sourced Data-Driven Approach	Oct 2, 2024	Data Integration, Feature Engineering	Unverified
Automatic deductive coding in discourse analysis: an application of large language models in learning analytics	Oct 2, 2024	Feature Engineering, Language Modeling	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CNN	14 gestures accuracy	0.98	—	Unverified