SOTAVerified

Prompt Engineering

Prompt engineering is the process of designing and refining the prompts used to generate text from language models such as GPT-3. The goal is to improve the quality and relevance of the generated text by crafting prompts that elicit the desired responses from the model.

Prompt engineering involves several steps: selecting the model architecture and parameters, designing the prompt format and structure, selecting the task and training data, and fine-tuning the model using the selected prompt and data.
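As a rough illustration of the "designing the prompt format and structure" step, the sketch below assembles a few-shot prompt from an instruction, worked examples, and a new query. The `PromptTemplate` class, its field names, and the sentiment task are hypothetical and chosen only for illustration; they do not come from any particular library, and no model is called.

```python
from dataclasses import dataclass, field


@dataclass
class PromptTemplate:
    """Hypothetical container for the parts of a prompt (not a library API)."""
    instruction: str
    examples: list[tuple[str, str]] = field(default_factory=list)
    input_label: str = "Input"
    output_label: str = "Output"

    def render(self, query: str) -> str:
        # Instruction first, then the worked examples, then the new query
        # with an empty output slot left for the model to complete.
        parts = [self.instruction.strip()]
        for source, target in self.examples:
            parts.append(f"{self.input_label}: {source}\n{self.output_label}: {target}")
        parts.append(f"{self.input_label}: {query}\n{self.output_label}:")
        return "\n\n".join(parts)


# Assumed example task: binary sentiment classification.
template = PromptTemplate(
    instruction="Classify the sentiment of each review as positive or negative.",
    examples=[
        ("The battery lasts all day.", "positive"),
        ("The screen cracked within a week.", "negative"),
    ],
)
prompt = template.render("Setup was quick and painless.")
print(prompt)
```

Keeping the instruction, examples, and labels as separate fields makes it easy to vary one part at a time when refining a prompt, which is one plausible way to organize the iteration described above.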

Prompt engineering is a crucial step in the development of language models, because the prompt strongly influences the quality and effectiveness of the model's responses. By designing and refining prompts, researchers and developers can improve the accuracy and relevance of the model's output, making it more useful for applications such as chatbots, language translation, and content creation.

Papers

Showing 351-400 of 1236 papers

| Title | Status | Hype |
|---|---|---|
| Generative Language Models Potential for Requirement Engineering Applications: Insights into Current Strengths and Limitations | | 0 |
| Generative AI Literacy: Twelve Defining Competencies | | 0 |
| MATATA: Weakly Supervised End-to-End MAthematical Tool-Augmented Reasoning for Tabular Applications | | 0 |
| Structured Object Language Modeling (SoLM): Native Structured Objects Generation Conforming to Complex Schemas with Self-Supervised Denoising | | 0 |
| Human Evaluation of Procedural Knowledge Graph Extraction from Text with Large Language Models | | 0 |
| QuaLLM-Health: An Adaptation of an LLM-Based Framework for Quantitative Data Extraction from Online Health Discussions | Code | 0 |
| Don't Command, Cultivate: An Exploratory Study of System-2 Alignment | Code | 0 |
| ThreatModeling-LLM: Automating Threat Modeling using Large Language Models for Banking System | | 0 |
| ASSERTIFY: Utilizing Large Language Models to Generate Assertions for Production Code | Code | 0 |
| Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis | Code | 0 |
| Med-PerSAM: One-Shot Visual Prompt Tuning for Personalized Segment Anything Model in Medical Domain | Code | 0 |
| Instruct or Interact? Exploring and Eliciting LLMs' Capability in Code Snippet Adaptation Through Prompt Engineering | Code | 0 |
| Optimizing Social Media Annotation of HPV Vaccine Skepticism and Misinformation Using Large Language Models: An Experimental Evaluation of In-Context Learning and Fine-Tuning Stance Detection Across Multiple Models | | 0 |
| FTA generation using GenAI with an Autonomy sensor Usecase | | 0 |
| BiomedCoOp: Learning to Prompt for Biomedical Vision-Language Models | Code | 2 |
| LEADRE: Multi-Faceted Knowledge Enhanced LLM Empowered Display Advertisement Recommender System | | 0 |
| Robust Planning with Compound LLM Architectures: An LLM-Modulo Approach | Code | 1 |
| Evaluating LLMs Capabilities Towards Understanding Social Dynamics | | 0 |
| GRL-Prompt: Towards Knowledge Graph based Prompt Optimization via Reinforcement Learning | | 0 |
| Can ChatGPT Overcome Behavioral Biases in the Financial Sector? Classify-and-Rethink: Multi-Step Zero-Shot Reasoning in the Gold Investment | | 0 |
| From Text to Pose to Image: Improving Diffusion Model Control and Quality | Code | 2 |
| Can Open-source LLMs Enhance Data Synthesis for Toxic Detection?: An Experimental Study | | 0 |
| Large Language Model for Qualitative Research -- A Systematic Mapping Study | | 0 |
| LLM-IE: A Python Package for Generative Information Extraction with Large Language Models | | 0 |
| Large corpora and large language models: a replicable method for automating grammatical annotation | | 0 |
| Empowering Meta-Analysis: Leveraging Large Language Models for Scientific Synthesis | Code | 0 |
| Large Language Models (LLMs) as Traffic Control Systems at Urban Intersections: A New Paradigm | | 0 |
| Generating Compositional Scenes via Text-to-image RGBA Instance Generation | | 0 |
| Can Artificial Intelligence Generate Quality Research Topics Reflecting Patient Concerns? | | 0 |
| How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception | | 0 |
| Responsible AI in Construction Safety: Systematic Evaluation of Large Language Models and Prompt Engineering | | 0 |
| LLMStinger: Jailbreaking LLMs using RL fine-tuned LLMs | | 0 |
| TIPO: Text to Image with Text Presampling for Prompt Optimization | Code | 2 |
| Efficient and Accurate Prompt Optimization: the Benefit of Memory in Exemplar-Guided Reflection | | 0 |
| Evaluating ChatGPT-3.5 Efficiency in Solving Coding Problems of Different Complexity Levels: An Empirical Analysis | Code | 0 |
| Likelihood as a Performance Gauge for Retrieval-Augmented Generation | Code | 0 |
| Reverse Prompt Engineering | | 0 |
| Leveraging Retrieval-Augmented Generation for Persian University Knowledge Retrieval | | 0 |
| Qwen2.5-32B: Leveraging Self-Consistent Tool-Integrated Reasoning for Bengali Mathematical Olympiad Problem Solving | | 0 |
| LLMs as Method Actors: A Model for Prompt Engineering and Architecture | Code | 1 |
| Web Archives Metadata Generation with GPT-4o: Challenges and Insights | Code | 0 |
| Leveraging LLMs to Enable Natural Language Search on Go-to-market Platforms | | 0 |
| Selecting Between BERT and GPT for Text Classification in Political Science Research | | 0 |
| From Medprompt to o1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond | | 0 |
| MRJ-Agent: An Effective Jailbreak Agent for Multi-Round Dialogue | | 0 |
| Prompt Engineering Using GPT for Word-Level Code-Mixed Language Identification in Low-Resource Dravidian Languages | | 0 |
| Automated, LLM enabled extraction of synthesis details for reticular materials from scientific literature | | 0 |
| DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models | | 0 |
| Leveraging Vision-Language Models for Manufacturing Feature Recognition in CAD Designs | | 0 |
| Benchmarking Vision, Language, & Action Models on Robotic Learning Tasks | Code | 1 |

Benchmark Results

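| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromptKD | Harmonic mean | 77.62 | | Unverified |
| 2 | Customized Ensemble | Harmonic mean | 75.49 | | Unverified |
| 3 | MMRL | Harmonic mean | 74.45 | | Unverified |
| 4 | MMRL++ | Harmonic mean | 74.44 | | Unverified |
| 5 | CoPrompt | Harmonic mean | 74.33 | | Unverified |
| 6 | HPT++ | Harmonic mean | 74.24 | | Unverified |
| 7 | HPT | Harmonic mean | 74.17 | | Unverified |
| 8 | ProMetaR | Harmonic mean | 74.09 | | Unverified |
| 9 | MetaPrompt | Harmonic mean | 74.02 | | Unverified |
| 10 | DePT | Harmonic mean | 74.02 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromptKD | Harmonic mean | 97.77 | | Unverified |
| 2 | HPT++ | Harmonic mean | 96.96 | | Unverified |
| 3 | MMRL++ | Harmonic mean | 96.75 | | Unverified |
| 4 | MMRL | Harmonic mean | 96.68 | | Unverified |
| 5 | HPT | Harmonic mean | 96.65 | | Unverified |
| 6 | CoPrompt | Harmonic mean | 96.55 | | Unverified |
| 7 | MetaPrompt | Harmonic mean | 96.32 | | Unverified |
| 8 | DePT | Harmonic mean | 96.28 | | Unverified |
| 9 | ProMetaR | Harmonic mean | 96.16 | | Unverified |
| 10 | RPO | Harmonic mean | 96.03 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromptKD | Harmonic mean | 77.94 | | Unverified |
| 2 | MMRL++ | Harmonic mean | 74.46 | | Unverified |
| 3 | HPT++ | Harmonic mean | 74.23 | | Unverified |
| 4 | MMRL | Harmonic mean | 73.82 | | Unverified |
| 5 | CoPrompt | Harmonic mean | 72.79 | | Unverified |
| 6 | ProMetaR | Harmonic mean | 72.31 | | Unverified |
| 7 | HPT | Harmonic mean | 72.16 | | Unverified |
| 8 | PromptSRC | Harmonic mean | 71.75 | | Unverified |
| 9 | DePT | Harmonic mean | 71.09 | | Unverified |
| 10 | RPO | Harmonic mean | 68.61 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MMRL++ | Harmonic mean | 91.94 | | Unverified |
| 2 | PromptKD | Harmonic mean | 89.14 | | Unverified |
| 3 | HPT++ | Harmonic mean | 87.36 | | Unverified |
| 4 | MMRL | Harmonic mean | 87.21 | | Unverified |
| 5 | CoPrompt | Harmonic mean | 85.84 | | Unverified |
| 6 | ProMetaR | Harmonic mean | 85.3 | | Unverified |
| 7 | DePT | Harmonic mean | 84.88 | | Unverified |
| 8 | HPT | Harmonic mean | 84.82 | | Unverified |
| 9 | MetaPrompt | Harmonic mean | 83.38 | | Unverified |
| 10 | MaPLe | Harmonic mean | 82.35 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromptKD | Harmonic mean | 45.17 | | Unverified |
| 2 | MMRL++ | Harmonic mean | 42.24 | | Unverified |
| 3 | HPT++ | Harmonic mean | 41.33 | | Unverified |
| 4 | MMRL | Harmonic mean | 41.15 | | Unverified |
| 5 | DePT | Harmonic mean | 40.73 | | Unverified |
| 6 | HPT | Harmonic mean | 40.28 | | Unverified |
| 7 | ProMetaR | Harmonic mean | 40.25 | | Unverified |
| 8 | PromptSRC | Harmonic mean | 40.15 | | Unverified |
| 9 | CoPrompt | Harmonic mean | 39.76 | | Unverified |
| 10 | MetaPrompt | Harmonic mean | 38.24 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromptKD | Harmonic mean | 90.24 | | Unverified |
| 2 | HPT | Harmonic mean | 87.16 | | Unverified |
| 3 | MMRL++ | Harmonic mean | 87.01 | | Unverified |
| 4 | MMRL | Harmonic mean | 86.78 | | Unverified |
| 5 | ProMetaR | Harmonic mean | 86.7 | | Unverified |
| 6 | DePT | Harmonic mean | 86.46 | | Unverified |
| 7 | PromptSRC | Harmonic mean | 85.95 | | Unverified |
| 8 | HPT++ | Harmonic mean | 85.85 | | Unverified |
| 9 | CoPrompt | Harmonic mean | 85.71 | | Unverified |
| 10 | MetaPrompt | Harmonic mean | 84.52 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromptKD | Harmonic mean | 97.15 | | Unverified |
| 2 | HPT++ | Harmonic mean | 96.91 | | Unverified |
| 3 | CoPrompt | Harmonic mean | 96.87 | | Unverified |
| 4 | MMRL | Harmonic mean | 96.74 | | Unverified |
| 5 | HPT | Harmonic mean | 96.71 | | Unverified |
| 6 | MaPLe | Harmonic mean | 96.58 | | Unverified |
| 7 | MMRL++ | Harmonic mean | 96.51 | | Unverified |
| 8 | ProMetaR | Harmonic mean | 96.49 | | Unverified |
| 9 | CoCoOp | Harmonic mean | 96.43 | | Unverified |
| 10 | DePT | Harmonic mean | 96.37 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromptKD | Harmonic mean | 83.13 | | Unverified |
| 2 | MMRL++ | Harmonic mean | 78.18 | | Unverified |
| 3 | MMRL | Harmonic mean | 78.06 | | Unverified |
| 4 | DePT | Harmonic mean | 77.79 | | Unverified |
| 5 | ProMetaR | Harmonic mean | 76.72 | | Unverified |
| 6 | PromptSRC | Harmonic mean | 76.58 | | Unverified |
| 7 | CoPrompt | Harmonic mean | 75.66 | | Unverified |
| 8 | HPT++ | Harmonic mean | 75.59 | | Unverified |
| 9 | HPT | Harmonic mean | 75.57 | | Unverified |
| 10 | MetaPrompt | Harmonic mean | 75.48 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromptKD | Harmonic mean | 82.6 | | Unverified |
| 2 | CoPrompt | Harmonic mean | 81.31 | | Unverified |
| 3 | MMRL++ | Harmonic mean | 81.28 | | Unverified |
| 4 | MMRL | Harmonic mean | 81.2 | | Unverified |
| 5 | HPT++ | Harmonic mean | 81.11 | | Unverified |
| 6 | DePT | Harmonic mean | 81.06 | | Unverified |
| 7 | HPT | Harmonic mean | 80.88 | | Unverified |
| 8 | ProMetaR | Harmonic mean | 80.82 | | Unverified |
| 9 | MetaPrompt | Harmonic mean | 80.62 | | Unverified |
| 10 | PromptSRC | Harmonic mean | 80.52 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromptKD | Harmonic mean | 86.1 | | Unverified |
| 2 | MMRL | Harmonic mean | 83.89 | | Unverified |
| 3 | HPT++ | Harmonic mean | 83.81 | | Unverified |
| 4 | MMRL++ | Harmonic mean | 83.81 | | Unverified |
| 5 | ProMetaR | Harmonic mean | 83.25 | | Unverified |
| 6 | HPT | Harmonic mean | 83.16 | | Unverified |
| 7 | CoPrompt | Harmonic mean | 83.07 | | Unverified |
| 8 | PromptSRC | Harmonic mean | 82.74 | | Unverified |
| 9 | DePT | Harmonic mean | 82.46 | | Unverified |
| 10 | MetaPrompt | Harmonic mean | 81.35 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromptKD | Harmonic mean | 93.05 | | Unverified |
| 2 | CoPrompt | Harmonic mean | 91.4 | | Unverified |
| 3 | MaPLe | Harmonic mean | 91.38 | | Unverified |
| 4 | ProMetaR | Harmonic mean | 91.34 | | Unverified |
| 5 | MetaPrompt | Harmonic mean | 91.29 | | Unverified |
| 6 | DePT | Harmonic mean | 91.22 | | Unverified |
| 7 | MMRL++ | Harmonic mean | 91.1 | | Unverified |
| 8 | PromptSRC | Harmonic mean | 91.1 | | Unverified |
| 9 | HPT++ | Harmonic mean | 91.09 | | Unverified |
| 10 | MMRL | Harmonic mean | 91.03 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | POMP | Top-1 accuracy % | 51.6 | | Unverified |
| 2 | MMRL | Top-1 accuracy % | 51.2 | | Unverified |
| 3 | HPT++ | Top-1 accuracy % | 51.18 | | Unverified |
| 4 | MaPLe | Top-1 accuracy % | 50.9 | | Unverified |
| 5 | PromptSRC | Top-1 accuracy % | 50.9 | | Unverified |
| 6 | HPT | Top-1 accuracy % | 50.85 | | Unverified |
| 7 | CoCoOp | Top-1 accuracy % | 50.63 | | Unverified |
| 8 | CoPrompt | Top-1 accuracy % | 50.5 | | Unverified |
| 9 | CLIP | Top-1 accuracy % | 47.77 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | POMP | Top-1 accuracy % | 77.9 | | Unverified |
| 2 | PromptSRC | Top-1 accuracy % | 77.8 | | Unverified |
| 3 | MMRL | Top-1 accuracy % | 77.53 | | Unverified |
| 4 | HPT++ | Top-1 accuracy % | 77.52 | | Unverified |
| 5 | CoPrompt | Top-1 accuracy % | 77.51 | | Unverified |
| 6 | HPT | Top-1 accuracy % | 77.38 | | Unverified |
| 7 | MaPLe | Top-1 accuracy % | 76.98 | | Unverified |
| 8 | CoCoOp | Top-1 accuracy % | 76.18 | | Unverified |
| 9 | CLIP | Top-1 accuracy % | 73.96 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | POMP | Top-1 accuracy % | 49.8 | | Unverified |
| 2 | PromptSRC | Top-1 accuracy % | 49.55 | | Unverified |
| 3 | CoPrompt | Top-1 accuracy % | 49.43 | | Unverified |
| 4 | HPT | Top-1 accuracy % | 49.36 | | Unverified |
| 5 | HPT++ | Top-1 accuracy % | 49.28 | | Unverified |
| 6 | MMRL | Top-1 accuracy % | 49.17 | | Unverified |
| 7 | MaPLe | Top-1 accuracy % | 49.15 | | Unverified |
| 8 | CoCoOp | Top-1 accuracy % | 48.75 | | Unverified |
| 9 | CLIP | Top-1 accuracy % | 46.15 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HPT++ | Top-1 accuracy % | 65.31 | | Unverified |
| 2 | HPT | Top-1 accuracy % | 65.25 | | Unverified |
| 3 | MMRL | Top-1 accuracy % | 64.47 | | Unverified |
| 4 | PromptSRC | Top-1 accuracy % | 64.35 | | Unverified |
| 5 | CoCoOp | Top-1 accuracy % | 64.07 | | Unverified |
| 6 | MaPLe | Top-1 accuracy % | 64.07 | | Unverified |
| 7 | POMP | Top-1 accuracy % | 63.8 | | Unverified |
| 8 | CLIP | Top-1 accuracy % | 60.83 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | POMP | Accuracy | 25.3 | | Unverified |
| 2 | VPT | Accuracy | 24.8 | | Unverified |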
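
Most of the tables above report a harmonic mean. The page does not say which quantities are averaged; in prompt-learning benchmarks this metric is commonly the harmonic mean of accuracy on base (seen) classes and novel (unseen) classes, and the sketch below assumes that reading. The function name and the sample accuracies are illustrative and are not taken from the tables.

```python
def harmonic_mean(base_acc: float, novel_acc: float) -> float:
    """Harmonic mean of two accuracies given in percent.

    Assumption: the two inputs are base-class and novel-class accuracy.
    """
    if base_acc <= 0 or novel_acc <= 0:
        # The harmonic mean is taken to be 0 when either accuracy is 0.
        return 0.0
    return 2 * base_acc * novel_acc / (base_acc + novel_acc)


# Made-up accuracies with the same arithmetic mean (80.0).
balanced = harmonic_mean(80.0, 80.0)
skewed = harmonic_mean(95.0, 65.0)
print(balanced, skewed)
```

Under this assumption the metric rewards balance: a method that gains base-class accuracy at the expense of novel-class accuracy scores lower than one with the same arithmetic mean and an even split.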