News Dark Web ChatGPT Unleashed: Meet DarkBERT

Interesting take on the dark web. I hope this is successful and they crack the whole dark web; to be honest, it's very hard to understand how it really works.

Btw, the same group of researchers published another paper last year, ‘Shedding New Light on the Language of the Dark Web,’ in which they introduced CoDA, a text corpus of the dark web collected from various onion services and divided into topical categories.

According to the authors, CoDA is a publicly available Dark Web dataset consisting of 10,000 web documents tailored towards text-based Dark Web analysis. Here is the paper:

 
Ah yes..."AI" language models...because it's working out so great right now /s

 
I guess the lives of cybercriminals, child traffickers and drug dealers just got a little more interesting. But since the CIA and NSA have already been processing petabytes of data for actionable intel using AI, I wonder how their model compares.

And when action is taken against these targets, it gives the idea of a 'loss function' more meaning.
 
The paper also details how much data they fed DarkBERT, including a table that lists every site and the category it was filed under.

https://editors.dexerto.com/wp-content/uploads/2023/05/17/stats-darkbert-1024x576.jpg
 
I'm not sure how I feel about the phrase "the anonymizing firewall of the Tor network".

Isn't it the world's worst-kept secret that Tor is operated by the CIA, or am I just blindly believing rumors?

I'm not being pedantic, this is a genuine question for anyone here who knows more about this than me, which is likely the majority of you guys.
 
You might find the actual history of the Tor Project interesting. It is now a non-profit organization. As to privacy and anonymity, the engineering works. It does get attacked by nation-states, but it mostly holds its own: it is extremely difficult to pick a target and deanonymize it. Bridges solved the problem of countries blocking Tor server IPs.

Carelessness is the main way lawbreakers get caught. Tor's anonymity is more effective when more people use it to access legitimate onion sites, such as newspapers, blogs, wikis, and so on. It is so effective that people have called it the main reason censorship is dead. Even China's Great Firewall (GFW) cannot defeat onion routing as deployed with bridges.