Found 566 repositories (showing 30)
KaiDMML
This is a dataset for fake news detection research
entitize
r/Fakeddit New Multimodal Benchmark Dataset for Fine-grained Fake News Detection
yaqingwang
Dataset for paper "Weak Supervision for Fake News Detection via Reinforcement Learning" published in AAAI'2020.
RMSnow
Official repository to release the code and datasets in the paper "Mining Dual Emotion for Fake News Detection", WWW 2021.
sfu-discourse-lab
Datasets for fake news and misinformation detection
shiivangii
This repository contains code and dataset for the paper titled: SpotFake+: A Multimodal Framework for Fake News Detection via Transfer Learning.
TrustworthyComp
"MCFEND: A Multi-source Benchmark Dataset for Chinese Fake News Detection," presented at the ACM Web Conference 2024 (WWW' 2024).
LeadingIndiaAI
Fake news is misinformation or manipulated news that is spread across social media with the intention of damaging a person, agency or organisation. Because fake news spreads so widely, computational methods are needed to detect it. Fake news detection aims to help users expose various kinds of fabricated news. To this end, we took datasets containing both fake and real news and ran a range of experiments to build a fake news detector. We used natural language processing, machine learning and deep learning techniques to classify the datasets. We also provide a comprehensive survey of fake news detection, covering fake news categorization and existing machine learning algorithms. In this project, we explored machine learning models such as Naïve Bayes, k-nearest neighbors, decision tree and random forest, and deep learning networks such as a shallow Convolutional Neural Network (CNN), a Very Deep Convolutional Neural Network (VDCNN), a Long Short-Term Memory network (LSTM), a Gated Recurrent Unit network (GRU), a Convolutional Neural Network combined with Long Short-Term Memory (CNN-LSTM) and a Convolutional Neural Network combined with a Gated Recurrent Unit (CNN-GRU).
ICTMCG
Official repository to release the code and datasets in the paper, "Integrating Pattern- and Fact-based Fake News Detection via Model Preference Learning", CIKM 2021.
thcheung
A curated list of datasets for fake news or rumor detection and analysis on social media.
shrebox
This repository contains supervised fake news detection on the LIAR dataset. See the analysis for details.
raj1603chdry
Fake News Detection System for detecting whether news is fake or not. The model is trained using "Liar, Liar Pants on Fire": A New Benchmark Dataset for Fake News Detection. Link for dataset: https://arxiv.org/abs/1705.00648.
byinth
Dataset for fake news detection in healthcare domain
CRIPAC-DIG
[SIGIR 2022] Source code and datasets for "Bias Mitigation for Evidence-aware Fake News Detection by Causal Intervention".
SadeemAlharthi
One of the problems facing Arabic fake news detection is the scarcity of Arabic datasets. We believe it is important to make available a dataset written in Arabic and created explicitly for this domain. Existing studies of Arabic fake news detection are limited because most datasets are not accessible for research. We have therefore built a dataset of Arabic fake news tweets for fake news detection.
tootouch
Clickbait article detection (2022, bflysoft & NIA) by DSBA lab.
gtraskas
A web app which interacts with a deployed SageMaker model performing detection on a fake news dataset.
alcorpas10
This is a Spanish dataset for fake news detection
MohammadNuramin
spamming. The presence of spam on web services such as search engines, email providers or online social networks can manifest in many ways, including spam advertising, malicious links, fake news or fake friends, as well as manipulation attempts. For online social networks, tracking and controlling spammers is of the utmost importance, both because of the security risk and because of the credibility of the information they disseminate. The objective of this project is to study Twitter's social spam using data mining, machine learning and data analysis techniques (that you have learned so far in any course!) on a dataset containing information on 767 social spammers and legitimate users crawled from Twitter in November and December 2014 and July 2018. In this project you first have to solve the spam detection problem and then analyse the dataset using methods presented during the data mining lessons.
das-lab
This is a dataset for fake news detection research
waniashafqat
Fake News Detection using Bi-directional LSTM on ISOT Dataset.
Arko98
Multilingual Fake News Dataset created for the research paper "A Transformer Based Approach to Multilingual Fake News Detection in Low Resource Languages" accepted at the ACM Transactions on Asian and Low-Resource Language Information Processing (ACM TALLIP)
This repository implements the multi-modal deep learning architecture published in 'SpotFake: A Multi-modal Framework for Fake News Detection' on the MediaEval 2015 dataset
Warishayat
This project focuses on fake news detection using machine learning and natural language processing (NLP) techniques. It analyzes news articles and classifies them as real or fake based on patterns in the text. Using algorithms like Naive Bayes, SVM, or deep learning, the model is trained on labeled datasets to identify misinformation and improve ne
daksh26022002
Fake News Detection in Python
In this project, we used various natural language processing techniques and machine learning algorithms to classify fake news articles using the scikit-learn libraries for Python.
Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
Prerequisites
What you need to install and how to install it:
Python 3.6
This setup requires Python 3.6 to be installed on your machine. You can download Python from https://www.python.org/downloads/. Once Python is downloaded and installed, you will need to set up PATH variables if you want to run Python programs directly (detailed instructions are below in the how to run software section). To do that, see https://www.pythoncentral.io/add-python-to-path-python-is-not-recognized-as-an-internal-or-external-command/. Setting up the PATH variable is optional, as you can also run the program without it; more instructions on this topic are given below.
The second and easier option is to download Anaconda and use its Anaconda Prompt to run the commands. To install Anaconda, see https://www.anaconda.com/download/
After installing either Python or Anaconda using the steps above, you will also need to install the following 3 packages: scikit-learn (sklearn), numpy and scipy.
If you chose to install Python 3.6, run the following commands in a command prompt/terminal:
    pip install -U scikit-learn
    pip install numpy
    pip install scipy
If you chose to install Anaconda, run the following commands in the Anaconda Prompt:
    conda install -c anaconda scikit-learn
    conda install -c anaconda numpy
    conda install -c anaconda scipy
Dataset used
The data source used for this project is the LIAR dataset, which contains 3 files in .tsv format for test, train and validation.
Below is a description of the data files used for this project.
LIAR: A BENCHMARK DATASET FOR FAKE NEWS DETECTION
William Yang Wang, "Liar, Liar Pants on Fire": A New Benchmark Dataset for Fake News Detection, to appear in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), short paper, Vancouver, BC, Canada, July 30-August 4, ACL.
The original dataset contains 14 variables/columns for the train, test and validation sets, as follows:
Column 1: the ID of the statement ([ID].json).
Column 2: the label. (Label class contains: True, Mostly-true, Half-true, Barely-true, False, Pants-fire)
Column 3: the statement.
Column 4: the subject(s).
Column 5: the speaker.
Column 6: the speaker's job title.
Column 7: the state info.
Column 8: the party affiliation.
Column 9-13: the total credit history count, including the current statement.
    9: barely true counts.
    10: false counts.
    11: half true counts.
    12: mostly true counts.
    13: pants on fire counts.
Column 14: the context (venue / location of the speech or statement).
To keep things simple, we chose only 2 variables from this original dataset for this classification. The other variables can be added later to add complexity and enhance the features. Below are the columns used to create the 3 datasets used in this project:
Column 1: Statement (News headline or text).
Column 2: Label (Label class contains: True, False)
The newly created dataset has only 2 classes, compared to 6 in the original. Below is the method used for reducing the number of classes.
Original -- New
True -- True
Mostly-true -- True
Half-true -- True
Barely-true -- False
False -- False
Pants-fire -- False
The datasets used for this project are in csv format, named train.csv, test.csv and valid.csv, and can be found in the repo. The original datasets are in the "liar" folder in tsv format.
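The six-to-two label reduction listed above can be sketched in Python as follows. This is a minimal illustration, not the repository's own code: the names LIAR_TO_BINARY, binarize_label and to_binary_rows are hypothetical, and the sample statements are made up for the demo.

```python
# Hypothetical helper names; only the mapping itself comes from the table above.
LIAR_TO_BINARY = {
    "true": True,
    "mostly-true": True,
    "half-true": True,
    "barely-true": False,
    "false": False,
    "pants-fire": False,
}


def binarize_label(label: str) -> bool:
    """Map one six-way LIAR label to the binary True/False class."""
    key = label.strip().lower()  # tolerate "FALSE", " Half-true ", etc.
    if key not in LIAR_TO_BINARY:
        raise ValueError(f"unknown LIAR label: {label!r}")
    return LIAR_TO_BINARY[key]


def to_binary_rows(rows):
    """Convert (statement, six_way_label) pairs to (statement, bool) pairs."""
    return [(statement, binarize_label(label)) for statement, label in rows]


if __name__ == "__main__":
    # Made-up statements, used only to exercise the mapping.
    sample = [
        ("Statement A", "Mostly-true"),
        ("Statement B", "Pants-fire"),
    ]
    for statement, label in to_binary_rows(sample):
        print(statement, label)
```

Raising on an unknown label, rather than silently defaulting to one class, is a design choice made here so that malformed rows surface early.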
File descriptions
DataPrep.py
This file contains all the pre-processing functions needed to process the input documents and texts. First we read the train, test and validation data files, then perform pre-processing such as tokenizing and stemming. Some exploratory data analysis is also performed, such as the response variable distribution, along with data quality checks such as null or missing values.
FeatureSelection.py
In this file we perform feature extraction and selection using methods from the scikit-learn Python libraries. For feature selection, we used methods such as simple bag-of-words and n-grams, followed by term frequency weighting such as tf-idf. We have also used word2vec and POS tagging to extract the features, though POS
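The bag-of-words, n-gram and tf-idf features mentioned above can be illustrated with a small standard-library sketch. It is not the repository's FeatureSelection.py and does not reproduce scikit-learn's weighting: it assumes whitespace tokenization, tf as a relative count and an unsmoothed idf of log(N / df). The function names ngrams and tfidf are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all word n-grams of length 1..n_max as space-joined strings."""
    out = []
    for n in range(1, n_max + 1):
        out.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return out


def tfidf(documents, n_max=2):
    """Return one {term: weight} dict per document.

    Assumed weighting (a plain textbook variant, not scikit-learn's default):
    tf = term count / number of terms in the document, idf = log(N / df).
    """
    term_lists = [ngrams(doc.lower().split(), n_max) for doc in documents]
    n_docs = len(term_lists)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


if __name__ == "__main__":
    # Made-up statements, used only to exercise the functions.
    docs = ["the tax cut passed", "the tax cut failed"]
    for doc, w in zip(docs, tfidf(docs)):
        top = sorted(w, key=w.get, reverse=True)[:3]
        print(doc, "->", top)
```

With this weighting, a term that occurs in every document gets weight zero, so only the terms that distinguish one statement from another carry signal.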
bksaini078
Training a neural network model on a small, high-quality labeled dataset for fake news detection
This study introduces MultiBanFakeDetect, a novel multimodal dataset for Bangla fake news detection, combining textual and visual information. It features TextFakeNet for text analysis and MultiFusionFake for integrating multimodal data.
RishiHazra
Fake News Detection on Liar dataset
div5252
Fake News Article Detection Datasets for Hindi Language
sonalgarg174
This is a dataset for fake news detection research