chatgptscrapeR: An R package to scrape conversations from publicly shared ChatGPT links

Installation

You can install the development version of chatgptscrapeR from GitHub with:

# install.packages("devtools")
devtools::install_github("sonsoleslp/chatgptscrapeR")

Example

Load the packages

library("chatgptscrapeR")
library("dplyr")
library("tidyr")

Scrape one conversation

# URL of the ChatGPT conversation
url1 <- "https://chatgpt.com/share/c7a912da-5dd3-4e46-9fd7-d777006068ff"
url2 <- "https://chatgpt.com/share/c302f7d3-9d89-4d93-ab89-68c0bfd0b667"

# Scrape the conversation
conversation <- scrape_chatgpt(url1)
select(conversation, order, message.author.role, message.content.parts)

order message.author.role message.content.parts
system
1 user Why is R better than Python?
2 assistant Choosing between R and Python often depends on the specific needs and context of a project, as both have their strengths and weaknesses. Here are some reasons why R might be considered better than Python for certain tasks:

1. Specialization in Statistical Analysis and Data Visualization

  • Comprehensive Statistical Packages: R has a wide array of packages and functions specifically designed for statistical analysis, making it highly specialized for tasks that require complex statistical computations.
  • Advanced Data Visualization: R's ggplot2 and other visualization libraries are renowned for their capabilities in creating detailed and aesthetically pleasing visualizations. It often offers more flexibility and customization options for statistical graphics compared to Python.

2. Built for Data Analysis

  • Integrated Development Environment (IDE): RStudio is a powerful IDE specifically designed for R, which enhances productivity with features like syntax highlighting, direct package management, and advanced plotting capabilities.
  • Community and Resources: R has a strong community focused on data analysis, with a wealth of resources, tutorials, and forums dedicated to statistical methodologies and best practices.

3. Domain-Specific Libraries

  • CRAN (Comprehensive R Archive Network): CRAN hosts thousands of packages tailored for various types of statistical analysis, bioinformatics, social science, and other specialized domains.
  • Bioconductor: For bioinformatics, R's Bioconductor repository offers extensive tools specifically for analyzing genomic data.

4. Statistical Accuracy and Methods

  • Proven Track Record in Academia: R has been widely adopted in academia for statistical research, ensuring that its methods are robust and reliable.
  • Wide Range of Statistical Tests: R provides extensive built-in functions for a vast array of statistical tests and methods not always available in Python.

5. Data Manipulation

  • dplyr and Data Frames: R's dplyr package provides a powerful grammar for data manipulation, making tasks like filtering, transforming, and summarizing data more intuitive and concise. The data.table package offers high performance for handling large datasets.

6. Reproducible Research

  • Knitr and RMarkdown: R excels in creating reproducible research documents. With knitr and RMarkdown, users can seamlessly integrate code, results, and narrative text into a single document, facilitating reproducibility and transparency.

7. Time Series Analysis

  • Specialized Packages: R has specialized packages like forecast, xts, and zoo for time series analysis, making it a preferred choice for economists, financial analysts, and researchers working with temporal data.

8. Statistical Reporting

  • Reporting Tools: R's tools for generating automated reports and dashboards (e.g., Shiny for interactive web applications) are highly effective for sharing analysis results.

Conclusion

While Python is a versatile general-purpose programming language with a strong ecosystem for data science, machine learning, and web development, R's strength lies in its specialized focus on statistical analysis and data visualization. The choice between R and Python should be based on the specific needs of your project and your personal or team's expertise and preferences. For tasks heavily reliant on statistical methods and complex data visualizations, R often proves to be more efficient and effective.
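
The first row of the output above carries the `system` role and has no `order` value. The following is a minimal sketch, not part of the package, of keeping only the visible turns and previewing them. It assumes the result has the columns shown above and that `order` is `NA` for such rows, as the `filter(!is.na(order))` call further down in this README suggests. The `mock` data frame is hypothetical stand-in data so that the snippet runs without a network connection:

```r
# Hypothetical stand-in for the result of scrape_chatgpt(url1);
# the column names mirror those shown in the output above.
mock <- data.frame(
  order = c(NA, 1, 2),
  message.author.role = c("system", "user", "assistant"),
  message.content.parts = c(NA,
                            "Why is R better than Python?",
                            "Choosing between R and Python often depends on ..."),
  stringsAsFactors = FALSE
)

# Keep only the rows that belong to the visible exchange
# (assumption: rows without an order value are not user-facing turns)
visible <- mock[!is.na(mock$order), ]

# Shorten long messages to their first 30 characters for a compact preview
visible$preview <- substr(visible$message.content.parts, 1, 30)
visible[, c("order", "message.author.role", "preview")]
```

Base R is used here only to keep the sketch dependency-free; the same steps can be written with `filter()` and `mutate()` from dplyr.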

Scrape a vector of conversations

urls <- c(url1, url2)

# Scrape the conversations
conversation <- scrape_chatgpt(urls)
select(conversation, order, title, message.author.role, message.content.parts)

order title message.author.role message.content.parts
R vs Python Benefits
R vs Python Benefits system
1 R vs Python Benefits user Why is R better than Python?
2 R vs Python Benefits assistant Choosing between R and Python often depends on the specific needs and context of a project, as both have their strengths and weaknesses. Here are some reasons why R might be considered better than Python for certain tasks:

1. Specialization in Statistical Analysis and Data Visualization

  • Comprehensive Statistical Packages: R has a wide array of packages and functions specifically designed for statistical analysis, making it highly specialized for tasks that require complex statistical computations.
  • Advanced Data Visualization: R's ggplot2 and other visualization libraries are renowned for their capabilities in creating detailed and aesthetically pleasing visualizations. It often offers more flexibility and customization options for statistical graphics compared to Python.

2. Built for Data Analysis

  • Integrated Development Environment (IDE): RStudio is a powerful IDE specifically designed for R, which enhances productivity with features like syntax highlighting, direct package management, and advanced plotting capabilities.
  • Community and Resources: R has a strong community focused on data analysis, with a wealth of resources, tutorials, and forums dedicated to statistical methodologies and best practices.

3. Domain-Specific Libraries

  • CRAN (Comprehensive R Archive Network): CRAN hosts thousands of packages tailored for various types of statistical analysis, bioinformatics, social science, and other specialized domains.
  • Bioconductor: For bioinformatics, R's Bioconductor repository offers extensive tools specifically for analyzing genomic data.

4. Statistical Accuracy and Methods

  • Proven Track Record in Academia: R has been widely adopted in academia for statistical research, ensuring that its methods are robust and reliable.
  • Wide Range of Statistical Tests: R provides extensive built-in functions for a vast array of statistical tests and methods not always available in Python.

5. Data Manipulation

  • dplyr and Data Frames: R's dplyr package provides a powerful grammar for data manipulation, making tasks like filtering, transforming, and summarizing data more intuitive and concise. The data.table package offers high performance for handling large datasets.

6. Reproducible Research

  • Knitr and RMarkdown: R excels in creating reproducible research documents. With knitr and RMarkdown, users can seamlessly integrate code, results, and narrative text into a single document, facilitating reproducibility and transparency.

7. Time Series Analysis

  • Specialized Packages: R has specialized packages like forecast, xts, and zoo for time series analysis, making it a preferred choice for economists, financial analysts, and researchers working with temporal data.

8. Statistical Reporting

  • Reporting Tools: R's tools for generating automated reports and dashboards (e.g., Shiny for interactive web applications) are highly effective for sharing analysis results.

Conclusion

While Python is a versatile general-purpose programming language with a strong ecosystem for data science, machine learning, and web development, R's strength lies in its specialized focus on statistical analysis and data visualization. The choice between R and Python should be based on the specific needs of your project and your personal or team's expertise and preferences. For tasks heavily reliant on statistical methods and complex data visualizations, R often proves to be more efficient and effective.

Why Spain is Great
Why Spain is Great system
1 Why Spain is Great user Why is Spain the best country in Europe?
2 Why Spain is Great assistant Spain is often considered one of the best countries in Europe for various reasons, reflecting its rich culture, history, natural beauty, and quality of life. Here are several factors that contribute to Spain's appeal:

1. Rich Cultural Heritage

Spain boasts a diverse cultural heritage influenced by various civilizations, including the Romans, Moors, and Visigoths. Cities like Barcelona, Madrid, Seville, and Granada offer architectural marvels such as the Sagrada Familia, Alhambra, and the Royal Palace.

2. Vibrant Festivals

Spain is famous for its lively festivals, including La Tomatina, Running of the Bulls in Pamplona, Las Fallas in Valencia, and Semana Santa (Holy Week) processions. These events showcase the country's festive spirit and traditions.

3. Delicious Cuisine

Spanish cuisine is renowned worldwide. Tapas, paella, jamón ibérico, and a variety of regional dishes offer a culinary journey. The Mediterranean diet, prevalent in Spain, is also considered one of the healthiest diets globally.

4. Beautiful Landscapes

Spain's geography is incredibly diverse, featuring stunning beaches along the Costa del Sol and Costa Brava, mountainous regions like the Pyrenees and Sierra Nevada, and picturesque countryside in regions like Andalusia and Castilla-La Mancha.

5. Excellent Climate

Spain enjoys a Mediterranean climate with warm summers and mild winters, especially along the coastal areas. The Canary Islands offer a subtropical climate, making Spain a year-round destination.

6. High Quality of Life

Spain ranks high in quality of life, with excellent healthcare, a relaxed lifestyle, and a strong sense of community. The cost of living in Spain is relatively affordable compared to other Western European countries.

7. Art and Literature

Spain has a rich artistic and literary heritage, producing world-renowned artists like Pablo Picasso, Salvador Dalí, and Francisco Goya, as well as literary figures like Miguel de Cervantes, the author of "Don Quixote."

8. Passion for Sports

Spain is passionate about sports, particularly football (soccer). The country is home to some of the most famous football clubs in the world, such as FC Barcelona and Real Madrid. Spain has also excelled in other sports, including tennis, basketball, and motor racing.

9. Welcoming People

Spaniards are known for their hospitality and friendliness. The strong emphasis on family and social connections makes visitors feel welcome and integrated into the local culture.

10. Historical Significance

Spain has played a pivotal role in world history, from the era of the Spanish Empire, which influenced large parts of the world, to its significant contributions to art, science, and exploration.

These factors, among others, contribute to Spain's reputation as one of the best countries in Europe. Its combination of cultural richness, natural beauty, and quality of life offers something for everyone.

3 Why Spain is Great user Which city in Spain is the best?
4 Why Spain is Great assistant The "best" city in Spain can be subjective, as it depends on personal preferences and what one is looking for in a city. However, a few cities stand out for their unique offerings, cultural significance, and overall appeal:

1. Barcelona

  • Cultural Attractions: Home to the works of Antoni Gaudí, including the Sagrada Familia, Park Güell, and Casa Batlló.
  • Beaches: Offers beautiful urban beaches like Barceloneta.
  • Vibrant Lifestyle: Known for its lively street life, festivals, and nightlife.
  • Cuisine: Famous for its tapas bars, seafood, and Catalan cuisine.

2. Madrid

  • Capital City: As the capital, it has a bustling, cosmopolitan atmosphere.
  • Museums: Houses world-class museums like the Prado Museum, Reina Sofia Museum, and Thyssen-Bornemisza Museum.
  • Royal Palace: Offers a glimpse into Spain's royal heritage.
  • Parks: Retiro Park and Casa de Campo provide green spaces within the city.

3. Seville

  • Historical Sites: Known for its stunning Gothic Seville Cathedral, Alcázar Palace, and the Giralda tower.
  • Flamenco: The heart of flamenco music and dance.
  • Festivals: Famous for Semana Santa (Holy Week) and Feria de Abril (April Fair).
  • Traditional Charm: Narrow, winding streets and picturesque neighborhoods like Santa Cruz.

4. Granada

  • Alhambra: The magnificent Moorish palace and fortress is a UNESCO World Heritage site.
  • Historic Districts: Albaicín and Sacromonte are rich in history and culture.
  • Student City: Home to a large student population, which adds vibrancy and affordability.

5. Valencia

  • Modern Architecture: The City of Arts and Sciences complex is a futuristic architectural marvel.
  • Beaches: Offers beautiful beaches like Malvarrosa.
  • Paella: Known as the birthplace of paella, offering the best of this iconic dish.
  • Festivals: Famous for Las Fallas, a spectacular festival with elaborate parades and fireworks.

6. Bilbao

  • Guggenheim Museum: A major draw for contemporary art lovers.
  • Revitalized City: Known for its transformation from an industrial city to a cultural hotspot.
  • Basque Cuisine: Offers unique and delicious Basque dishes.

7. San Sebastián (Donostia)

  • Beaches: La Concha Beach is considered one of the best city beaches in Europe.
  • Cuisine: Known for its pintxos (Basque tapas) and high concentration of Michelin-starred restaurants.
  • Film Festival: Hosts an internationally renowned film festival.

Choosing the Best City

Ultimately, the best city in Spain depends on what you're looking for:

  • For Art and Architecture: Barcelona
  • For Museums and Nightlife: Madrid
  • For History and Tradition: Seville
  • For Moorish Heritage: Granada
  • For Modern Attractions: Valencia
  • For Contemporary Culture: Bilbao
  • For Beaches and Cuisine: San Sebastián

Each city offers a unique experience and highlights different aspects of Spain's diverse culture and history.
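
Because every row carries the conversation `title`, the combined result can be summarised per conversation. The sketch below counts turns per title and role. It is an illustration under assumptions: `mock` is hypothetical stand-in data with the column names shown above, simplified to one `system` row per conversation, and rows without an `order` value are treated as non-visible.

```r
# Hypothetical stand-in for the result of scrape_chatgpt(urls)
mock <- data.frame(
  order = c(NA, 1, 2, NA, 1, 2, 3, 4),
  title = c(rep("R vs Python Benefits", 3), rep("Why Spain is Great", 5)),
  message.author.role = c("system", "user", "assistant",
                          "system", "user", "assistant", "user", "assistant"),
  stringsAsFactors = FALSE
)

# Drop rows that have no position in the visible exchange
turns <- mock[!is.na(mock$order), ]

# Cross-tabulate conversation title against author role
counts <- table(turns$title, turns$message.author.role)
counts
```

A table like this makes it easy to spot conversations that are unusually short or that contain no assistant reply.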

Scrape a dataframe containing conversation URLs

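# One row per participant; the url column holds the shared conversation link
df <- data.frame(user = c("User1", "User2"),
                 age = c(15, 14),
                 gender = c("M", "F"),
                 url = urls)

# Scrape every URL in the column and drop rows without an order value
conversation <- scrape_chatgpt(df, column = "url") |> filter(!is.na(order))
select(conversation, order, user, title, message.author.role, message.content.parts)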

order user title message.author.role message.content.parts
1 User1 R vs Python Benefits user Why is R better than Python?
2 User1 R vs Python Benefits assistant Choosing between R and Python often depends on the specific needs and context of a project, as both have their strengths and weaknesses. Here are some reasons why R might be considered better than Python for certain tasks:

1. Specialization in Statistical Analysis and Data Visualization

  • Comprehensive Statistical Packages: R has a wide array of packages and functions specifically designed for statistical analysis, making it highly specialized for tasks that require complex statistical computations.
  • Advanced Data Visualization: R's ggplot2 and other visualization libraries are renowned for their capabilities in creating detailed and aesthetically pleasing visualizations. It often offers more flexibility and customization options for statistical graphics compared to Python.

2. Built for Data Analysis

  • Integrated Development Environment (IDE): RStudio is a powerful IDE specifically designed for R, which enhances productivity with features like syntax highlighting, direct package management, and advanced plotting capabilities.
  • Community and Resources: R has a strong community focused on data analysis, with a wealth of resources, tutorials, and forums dedicated to statistical methodologies and best practices.

3. Domain-Specific Libraries

  • CRAN (Comprehensive R Archive Network): CRAN hosts thousands of packages tailored for various types of statistical analysis, bioinformatics, social science, and other specialized domains.
  • Bioconductor: For bioinformatics, R's Bioconductor repository offers extensive tools specifically for analyzing genomic data.

4. Statistical Accuracy and Methods

  • Proven Track Record in Academia: R has been widely adopted in academia for statistical research, ensuring that its methods are robust and reliable.
  • Wide Range of Statistical Tests: R provides extensive built-in functions for a vast array of statistical tests and methods not always available in Python.

5. Data Manipulation

  • dplyr and Data Frames: R's dplyr package provides a powerful grammar for data manipulation, making tasks like filtering, transforming, and summarizing data more intuitive and concise. The data.table package offers high performance for handling large datasets.

6. Reproducible Research

  • Knitr and RMarkdown: R excels in creating reproducible research documents. With knitr and RMarkdown, users can seamlessly integrate code, results, and narrative text into a single document, facilitating reproducibility and transparency.

7. Time Series Analysis

  • Specialized Packages: R has specialized packages like forecast, xts, and zoo for time series analysis, making it a preferred choice for economists, financial analysts, and researchers working with temporal data.

8. Statistical Reporting

  • Reporting Tools: R's tools for generating automated reports and dashboards (e.g., Shiny for interactive web applications) are highly effective for sharing analysis results.

Conclusion

While Python is a versatile general-purpose programming language with a strong ecosystem for data science, machine learning, and web development, R's strength lies in its specialized focus on statistical analysis and data visualization. The choice between R and Python should be based on the specific needs of your project and your personal or team's expertise and preferences. For tasks heavily reliant on statistical methods and complex data visualizations, R often proves to be more efficient and effective.

1 User2 Why Spain is Great user Why is Spain the best country in Europe?
2 User2 Why Spain is Great assistant Spain is often considered one of the best countries in Europe for various reasons, reflecting its rich culture, history, natural beauty, and quality of life. Here are several factors that contribute to Spain's appeal:

1. Rich Cultural Heritage

Spain boasts a diverse cultural heritage influenced by various civilizations, including the Romans, Moors, and Visigoths. Cities like Barcelona, Madrid, Seville, and Granada offer architectural marvels such as the Sagrada Familia, Alhambra, and the Royal Palace.

2. Vibrant Festivals

Spain is famous for its lively festivals, including La Tomatina, Running of the Bulls in Pamplona, Las Fallas in Valencia, and Semana Santa (Holy Week) processions. These events showcase the country's festive spirit and traditions.

3. Delicious Cuisine

Spanish cuisine is renowned worldwide. Tapas, paella, jamón ibérico, and a variety of regional dishes offer a culinary journey. The Mediterranean diet, prevalent in Spain, is also considered one of the healthiest diets globally.

4. Beautiful Landscapes

Spain's geography is incredibly diverse, featuring stunning beaches along the Costa del Sol and Costa Brava, mountainous regions like the Pyrenees and Sierra Nevada, and picturesque countryside in regions like Andalusia and Castilla-La Mancha.

5. Excellent Climate

Spain enjoys a Mediterranean climate with warm summers and mild winters, especially along the coastal areas. The Canary Islands offer a subtropical climate, making Spain a year-round destination.

6. High Quality of Life

Spain ranks high in quality of life, with excellent healthcare, a relaxed lifestyle, and a strong sense of community. The cost of living in Spain is relatively affordable compared to other Western European countries.

7. Art and Literature

Spain has a rich artistic and literary heritage, producing world-renowned artists like Pablo Picasso, Salvador Dalí, and Francisco Goya, as well as literary figures like Miguel de Cervantes, the author of "Don Quixote."

8. Passion for Sports

Spain is passionate about sports, particularly football (soccer). The country is home to some of the most famous football clubs in the world, such as FC Barcelona and Real Madrid. Spain has also excelled in other sports, including tennis, basketball, and motor racing.

9. Welcoming People

Spaniards are known for their hospitality and friendliness. The strong emphasis on family and social connections makes visitors feel welcome and integrated into the local culture.

10. Historical Significance

Spain has played a pivotal role in world history, from the era of the Spanish Empire, which influenced large parts of the world, to its significant contributions to art, science, and exploration.

These factors, among others, contribute to Spain's reputation as one of the best countries in Europe. Its combination of cultural richness, natural beauty, and quality of life offers something for everyone.

3 User2 Why Spain is Great user Which city in Spain is the best?
4 User2 Why Spain is Great assistant The "best" city in Spain can be subjective, as it depends on personal preferences and what one is looking for in a city. However, a few cities stand out for their unique offerings, cultural significance, and overall appeal:

1. Barcelona

  • Cultural Attractions: Home to the works of Antoni Gaudí, including the Sagrada Familia, Park Güell, and Casa Batlló.
  • Beaches: Offers beautiful urban beaches like Barceloneta.
  • Vibrant Lifestyle: Known for its lively street life, festivals, and nightlife.
  • Cuisine: Famous for its tapas bars, seafood, and Catalan cuisine.

2. Madrid

  • Capital City: As the capital, it has a bustling, cosmopolitan atmosphere.
  • Museums: Houses world-class museums like the Prado Museum, Reina Sofia Museum, and Thyssen-Bornemisza Museum.
  • Royal Palace: Offers a glimpse into Spain's royal heritage.
  • Parks: Retiro Park and Casa de Campo provide green spaces within the city.

3. Seville

  • Historical Sites: Known for its stunning Gothic Seville Cathedral, Alcázar Palace, and the Giralda tower.
  • Flamenco: The heart of flamenco music and dance.
  • Festivals: Famous for Semana Santa (Holy Week) and Feria de Abril (April Fair).
  • Traditional Charm: Narrow, winding streets and picturesque neighborhoods like Santa Cruz.

4. Granada

  • Alhambra: The magnificent Moorish palace and fortress is a UNESCO World Heritage site.
  • Historic Districts: Albaicín and Sacromonte are rich in history and culture.
  • Student City: Home to a large student population, which adds vibrancy and affordability.

5. Valencia

  • Modern Architecture: The City of Arts and Sciences complex is a futuristic architectural marvel.
  • Beaches: Offers beautiful beaches like Malvarrosa.
  • Paella: Known as the birthplace of paella, offering the best of this iconic dish.
  • Festivals: Famous for Las Fallas, a spectacular festival with elaborate parades and fireworks.

6. Bilbao

  • Guggenheim Museum: A major draw for contemporary art lovers.
  • Revitalized City: Known for its transformation from an industrial city to a cultural hotspot.
  • Basque Cuisine: Offers unique and delicious Basque dishes.

7. San Sebastián (Donostia)

  • Beaches: La Concha Beach is considered one of the best city beaches in Europe.
  • Cuisine: Known for its pintxos (Basque tapas) and high concentration of Michelin-starred restaurants.
  • Film Festival: Hosts an internationally renowned film festival.

Choosing the Best City

Ultimately, the best city in Spain depends on what you're looking for:

  • For Art and Architecture: Barcelona
  • For Museums and Nightlife: Madrid
  • For History and Tradition: Seville
  • For Moorish Heritage: Granada
  • For Modern Attractions: Valencia
  • For Contemporary Culture: Bilbao
  • For Beaches and Cuisine: San Sebastián

Each city offers a unique experience and highlights different aspects of Spain's diverse culture and history.
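
When the URL column comes from collected data, some entries may be missing or malformed. How `scrape_chatgpt` handles such input is not documented here, so one cautious option is to filter the data frame first. The sketch below is an assumption-laden illustration: the regular expression is inferred only from the two example URLs in this README and may not cover every valid share link, and the data frame is hypothetical.

```r
# Hypothetical collected data; the second and third URLs are deliberately unusable
df_raw <- data.frame(
  user = c("User1", "User2", "User3"),
  url = c("https://chatgpt.com/share/c7a912da-5dd3-4e46-9fd7-d777006068ff",
          "not a link",
          NA),
  stringsAsFactors = FALSE
)

# Assumed pattern, inferred only from the example URLs in this README
share_pattern <- "^https://chatgpt\\.com/share/[0-9a-f-]+$"

# TRUE only for non-missing URLs that match the assumed pattern
is_share_link <- !is.na(df_raw$url) & grepl(share_pattern, df_raw$url)

# Rows that look safe to pass on, e.g. scrape_chatgpt(valid, column = "url")
valid <- df_raw[is_share_link, ]
valid
```

Keeping the rejected rows (`df_raw[!is_share_link, ]`) in a separate object makes it possible to follow up on them later.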

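# Keywords computed on the conversation grouped by author role
text_data <- conversation |>
  group_by(message.author.role) |>
  get_keywords(token = "words", top = 5, remove_stopwords = TRUE)

# Top 5 words across all messages, with stopwords removed
conversation |> get_keywords(token = "words", top = 5, remove_stopwords = TRUE)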
#>          word  n
#> 1       spain 16
#> 2 statistical 14
#> 3        city 13
#> 4        data 13
#> 5           r 13
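
To give an intuition for the `word` and `n` columns above, here is a rough base R approximation of word-frequency counting. It is not the implementation of `get_keywords()`; the function name `top_words` and its tiny stopword list are hypothetical and chosen only for illustration.

```r
# Illustrative only: count words after lowercasing and removing stopwords.
# top_words() and its default stopword list are hypothetical helpers.
top_words <- function(text, top = 5,
                      stopwords = c("the", "is", "a", "of", "in", "and", "for")) {
  # Split on runs of non-letter characters
  words <- unlist(strsplit(tolower(text), "[^[:alpha:]]+"))
  # Drop empty tokens and stopwords
  words <- words[nzchar(words) & !(words %in% stopwords)]
  # Count occurrences and sort from most to least frequent
  counts <- sort(table(words), decreasing = TRUE)
  result <- data.frame(word = names(counts), n = as.integer(counts),
                       stringsAsFactors = FALSE)
  head(result, top)
}

msgs <- c("Why is R better than Python?",
          "R is strong in statistical analysis and statistical graphics.")
res <- top_words(msgs, top = 2)
res
```

A real stopword list is much longer than the one used here, which is one reason the counts from this sketch would differ from the package output.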
