JudgeGPT - Evaluating AI-Generated News Authenticity

JudgeGPT is a research project that studies how people perceive the authenticity of AI-generated news compared with human-written news. It gathers data on how individuals distinguish real from artificial news content in an era of rapidly advancing language models.

🔍 Want to test your judgment? If you're interested in seeing this application in action and would like to participate in the evaluation of fake news, please visit our interactive survey at https://judgegpt.streamlit.app/. A port to React is currently in development, and can be beta tested at https://aka.ms/JudgeGPT.

🧠 Curious about spotting fake news? For tips and insights, especially during election periods, check out this blog post.

Project Overview

JudgeGPT's core is an interactive survey platform built with Streamlit, inviting participants to analyze news fragments and assess their perceived origin (human or machine-generated) and authenticity. This process is crucial for:

  1. Understanding public perception of AI-generated content
  2. Refining AI detection methodologies
  3. Improving the quality of AI-generated news

The project complements our sister initiative, RogueGPT, which focuses on the generation of news content.

About the Name: JudgeGPT

The name JudgeGPT reflects the core objective of this research project. The term "GPT" is used in a pars pro toto manner: it denotes not just the Generative Pre-trained Transformer models developed by OpenAI but also the broader spectrum of Large Language Models (LLMs). While the project may initially focus on content generated by GPT models, it is designed to evaluate news fragments produced by any advanced LLM. The word "Judge" refers to the action performed by participants, who are invited to judge the news fragments presented to them and determine their authenticity (real vs. fake) and origin (human-generated vs. machine-generated).

Key Components

  • app.py: The main application script that powers the Streamlit web interface, facilitating the survey process, data collection, and interaction with a MongoDB database for result storage.

  • requirements.txt: A simple file listing all necessary Python packages to ensure easy setup and deployment of the JudgeGPT application.

  • Database integration for data storage and retrieval.

Installation

To run JudgeGPT locally, for example to contribute to its development, follow these steps:

  1. Clone the repository to your local machine:

    git clone https://github.com/aloth/JudgeGPT.git
    cd JudgeGPT
    
  2. Install the required dependencies listed in requirements.txt using pip:

    pip install -r requirements.txt
    
  3. Launch the Streamlit application:

    streamlit run app.py
    

Usage

Upon running the application, users are presented with a series of news fragments retrieved from a MongoDB database. Participants are asked to:

  1. Read each news fragment.
  2. Use the sliders to rate their perception of:
    • Authenticity (Real vs. Fake)
    • Source (Human vs. Machine)
  3. Submit their response, contributing to the research dataset.

This iterative process allows for the collection of valuable data on news authenticity perceptions, feeding into analytical studies aimed at improving AI news generation and detection frameworks.
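To make the shape of the collected data concrete, here is a minimal sketch of what a single submitted response could look like before it is stored. It is not taken from app.py: the `Judgment` class, its field names, and the 0 to 1 slider scale are assumptions made for illustration only.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class Judgment:
    """One participant response to one news fragment (hypothetical schema)."""

    fragment_id: str
    authenticity: float  # assumed scale: 0.0 = clearly fake, 1.0 = clearly real
    source: float        # assumed scale: 0.0 = clearly human, 1.0 = clearly machine

    def to_document(self) -> dict:
        """Validate both slider values and return a dict ready for storage."""
        for name in ("authenticity", "source"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be within [0, 1], got {value}")
        doc = asdict(self)
        # Record the submission time in UTC so responses can be ordered later.
        doc["submitted_at"] = datetime.now(timezone.utc).isoformat()
        return doc
```

A dictionary like the one returned by `to_document()` is the kind of record a document database such as MongoDB stores; the actual schema used by JudgeGPT may differ.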

Language Support and Language Detection

JudgeGPT aims to provide a personalized user experience by automatically determining your language preference to tailor the survey content accordingly. However, should you wish to manually set your preferred language, you can easily change this in the app. Furthermore, it is possible to specify the user language through URL parameters. For instance, to set the language to German, you can use the URL https://judgegpt.streamlit.app/?language=de, or for French, https://judgegpt.streamlit.app/?language=fr.
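The URL-parameter behavior described above can be sketched with the standard library alone. This is an illustration, not the code in app.py: the function name `language_from_url`, the set of supported codes, and the English fallback are assumptions (only `de` and `fr` appear in the examples above).

```python
from urllib.parse import urlparse, parse_qs

# Assumed values for illustration; the real list of supported languages may differ.
SUPPORTED_LANGUAGES = {"en", "de", "fr"}
DEFAULT_LANGUAGE = "en"


def language_from_url(url: str,
                      supported=SUPPORTED_LANGUAGES,
                      default=DEFAULT_LANGUAGE) -> str:
    """Return the code given in ?language=..., or the default if absent or unsupported."""
    params = parse_qs(urlparse(url).query)
    values = params.get("language", [])
    if values:
        code = values[0].strip().lower()
        if code in supported:
            return code
    return default
```

Falling back to a default instead of raising an error keeps a mistyped link usable: the participant still reaches the survey and can switch the language manually in the app.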

Currently, JudgeGPT supports the following languages:

Customizing Age Range

You can customize the age range for participants by adding min_age and max_age parameters to the URL. For example, to limit participants to ages 15-25, use the following URL:

https://judgegpt.streamlit.app/?min_age=15&max_age=25
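As a rough sketch of how such parameters might be read and validated, the snippet below parses `min_age` and `max_age` from a URL. The function name `age_range_from_url` and the fallback bounds of 0 and 120 are hypothetical; the defaults JudgeGPT actually applies are not stated here.

```python
from urllib.parse import urlparse, parse_qs


def age_range_from_url(url: str, default_min: int = 0, default_max: int = 120):
    """Return (min_age, max_age) from the URL, using fallbacks for bad input."""
    params = parse_qs(urlparse(url).query)

    def read_int(name: str, fallback: int) -> int:
        # A missing or non-numeric parameter falls back to the default bound.
        try:
            return int(params[name][0])
        except (KeyError, ValueError):
            return fallback

    low = read_int("min_age", default_min)
    high = read_int("max_age", default_max)
    if low > high:
        # Inconsistent bounds: ignore the custom range entirely.
        return default_min, default_max
    return low, high
```

With the example URL above, this returns `(15, 25)`.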

Project Status

As an early work in progress, JudgeGPT is continuously evolving, with updates and improvements being made regularly. The goal is to expand the scope of the survey, enhance the user interface, and deepen the analytical aspects of the project to provide richer insights into the dynamics of news authenticity in the age of AI.

Contributing

We welcome contributions to JudgeGPT from the community! To get involved:

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

For major changes, please open an issue first to discuss what you would like to change.

Future Directions and Ideas for Implementation

While JudgeGPT has established a robust framework for evaluating perceptions of news authenticity, several enhancements are planned to deepen user engagement, enrich the experience, and provide more nuanced insights from the data collected. These ideas include:

  • Gamification: Introducing gamification elements such as scoring systems, achievement badges, and leaderboards to incentivize participation. Users could earn points for accurately identifying fake vs. real news, with higher scores unlocking new levels or rewards. This approach not only encourages sustained engagement but also promotes a competitive and educational environment.

  • Interactive Results Visualization: Developing an interactive dashboard using tools like Power BI or D3.js, where participants can view real-time results and insights derived from the collective data. This feature would enhance transparency, enabling users to explore trends, patterns, and correlations in news perception across different demographics and regions. For instance, users could see how their accuracy compares to others globally or within specific regions.

  • Personalized Feedback Mechanisms: Implementing personalized feedback systems that analyze a user's performance over time. For example, users could receive detailed reports on their accuracy rates, the types of news they struggle to identify, and how their perceptions align or diverge from broader trends. Leveraging machine learning algorithms, we could offer tailored tips on improving news discernment skills, fostering a more informed and critical user base.

  • Social Media Integration: Integrating with social media platforms (e.g., X, LinkedIn, Facebook, and Instagram) to allow users to share their achievement badges and scores directly. This would increase the visibility and reach of the survey and encourage more people to participate, contributing to a broader dataset for analysis. The integration could use OAuth for secure authentication and a seamless user experience.

  • UI/UX Enhancements: Refining the user interface and experience to make the platform more intuitive and accessible. This could involve redesigning the layout for better navigation, optimizing the platform for mobile devices, and improving load times. Leveraging frameworks like React or Vue.js could facilitate these improvements, ensuring a responsive and user-friendly interface.

  • Localization and Multilingual Support: Expanding the platform to support more languages and regional content, including languages with non-Latin scripts such as Chinese. This would involve not only translating the interface but also adapting content to reflect regional news and cultural nuances. Using i18n libraries and collaborating with native speakers or experts could ensure accuracy and relevance, broadening the research scope and enabling data collection from a more diverse global audience.

  • Scalability and Production Environment: Transitioning JudgeGPT to a scalable production environment using platforms like Microsoft Azure. This would involve containerizing the application with Docker, deploying it via Kubernetes for orchestration, and ensuring high availability through Azure's cloud infrastructure. This setup would allow for seamless scaling as user demand grows, while also providing robust security and compliance features.

  • Image Support: In addition to fake news, we plan to allow users to evaluate real photos vs. deepfakes. This feature will enable participants to assess the authenticity of images, expanding the scope of the project to cover visual misinformation. Leveraging computer vision techniques and deep learning models, this enhancement will provide users with new challenges and further contribute to the understanding of how AI can both create and detect fake visual content.

License

JudgeGPT is open-source and available under the GNU GPLv3 License. For more details, see the LICENSE file in the repository.

Acknowledgments

  • OpenAI for their groundbreaking GPT models
  • Streamlit for enabling rapid development of our web application
  • MongoDB for robust database solutions
  • The open-source community for invaluable tools and libraries

Disclaimer

JudgeGPT is an independent research project and is not affiliated with, endorsed by, or in any way officially connected to OpenAI. The use of "GPT" within our project name is purely for descriptive purposes, indicating the use of generative pre-trained transformer models as a core technology in our research. Our project's explorations and findings are our own and do not reflect the views or positions of OpenAI or its collaborators. We are committed to responsible AI research and adhere to ethical guidelines in all aspects of our work, including the generation and analysis of content.


Star JudgeGPT on GitHub if you find it interesting! This helps us reach a wider audience.

📧 Get Involved! Participate in the survey, contribute to the code, or open an issue to report problems, suggest features, or ask questions. We appreciate your feedback and support!