Cybercrimeology

1 Hundred: An AI assisted analysis of Cybercrimeology

Episode Summary

One hundred episodes of cybercrime, its research and its researchers. We take a look at the podcast: what we have done, what we talked about, what goes into its production, and its impact on me as a host. This episode's guest is the future, or at least what it might be, as a combination of AI technologies appears as our research assistant.

Episode Notes

Summary:

The main points of this episode are:

Celebrating the 100th episode of Cybercrimeology and reflecting on the podcast's journey over the past three years.
Discussing the use of new technologies, such as AI, for analyzing and understanding the podcast's content.
Analyzing the podcast's content using natural language processing and summarization techniques to identify recurring themes and research topics.
Identifying common themes in the podcast, including abuse in relationships, privacy invasion, law enforcement in cybercrime, social engineering, and age-related factors in cybercrime.
Discussing various research methodologies covered in the podcast, such as technographs, online experiments, and survey research.
Highlighting the dedication of guests who share their time and research without any financial incentives.
Answering questions about the process of creating each episode, including research, interviews, editing, and production.
Discussing the volume of work represented by 99 episodes totaling over 5 hours of content and involving 96 guests.
Reflecting on the impact of the podcast and its growth over the past three years, including achieving 100,000 downloads.
Looking forward to the future of the podcast and the potential for new technologies to enhance its content and reach.

About our guests:

Alloy:

https://platform.openai.com/docs/guides/text-to-speech

voicing generations from

ChatGPT

https://openai.com/blog/chatgpt

Papers or resources mentioned in this episode:

The BART model:

https://huggingface.co/docs/transformers/model_doc/bart

The DistilBERT model:

https://huggingface.co/docs/transformers/model_doc/distilbert

Results:

Which terms were spoken about the most, and what was the sentiment around them?

Noun	Occurrences	FilesOccurredIn	SentimentScoreSum
people	2529	94	92.60830581188202
time	1133	83	79.5210649
research	1396	80	79.49750900268553
way	1005	74	73.79837167263031
things	1238	73	72.45885318517685
lot	1117	71	70.87118428945543
data	903	46	44.24124717712402
kind	667	44	43.9891608
crime	885	43	42.725725710392005
cyber	805	41	39.68457114696503
cybercrime	481	38	36.90566980838775
thing	393	36	35.59294366836548
security	527	31	30.89444762468338
information	467	29	28.87013864517212
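The columns above can be reproduced in outline with a small tally. The following is a minimal sketch, not the script used for the episode: it assumes the nouns have already been extracted from each transcript, and it assumes (hypothetically) that each file carries a single sentiment score and that SentimentScoreSum adds up the scores of the files in which the noun appears. The function name `tally_nouns`, the file names, and the toy scores are all illustrative.

```python
from collections import defaultdict


def tally_nouns(files):
    """Tally nouns across files.

    `files` maps a file name to a pair (nouns, sentiment_score), where
    `nouns` is the list of nouns found in that file and `sentiment_score`
    is one score for the whole file (an assumption of this sketch).
    """
    stats = defaultdict(
        lambda: {"occurrences": 0, "files": 0, "sentiment_sum": 0.0}
    )
    for _name, (nouns, score) in files.items():
        # Every mention counts towards Occurrences.
        for noun in nouns:
            stats[noun]["occurrences"] += 1
        # Each file counts once per noun towards FilesOccurredIn,
        # and contributes its score once to the sentiment sum.
        for noun in set(nouns):
            stats[noun]["files"] += 1
            stats[noun]["sentiment_sum"] += score
    return dict(stats)


# Toy input: two made-up transcripts with made-up scores.
episodes = {
    "ep001.txt": (["people", "research", "people"], 0.75),
    "ep002.txt": (["people", "data"], 0.5),
}

table = tally_nouns(episodes)

# Sort as in the table above: by number of files, descending.
rows = sorted(table.items(), key=lambda kv: kv[1]["files"], reverse=True)
for noun, s in rows:
    print(noun, s["occurrences"], s["files"], s["sentiment_sum"], sep="\t")
```

Counting files separately from mentions is what lets a word like "research" have more occurrences than "time" while appearing in fewer files.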

Was there a change in the sentiment of the podcast after the end of pandemic conditions, assuming that the pandemic ended at the end of Q3 2021?

The model is given by:

$$y_i \sim \mathrm{Normal}(\mu_i, \sigma)$$

where

$$\mu_i = \beta_0 + \beta_{\text{after\_event}} \cdot x_i$$

Here, the parameters are defined as follows:

$\beta_0$: Intercept, with a Student's t-distribution prior with 3 degrees of freedom, a location parameter of 0.8, and a scale parameter of 2.5.
$\beta_{\text{after\_event}}$: Coefficient for the predictor variable (after_event), with a flat prior.
$\sigma$: Standard deviation of the response variable, with a Student's t-distribution prior with 3 degrees of freedom, a location parameter of 0, and a scale parameter of 2.5.
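To make the structure of this model concrete, here is a minimal sketch in Python. It is not the Bayesian fit reported in these notes: it swaps in ordinary least squares, which for a single 0/1 predictor reduces to the mean of the "before" group (the intercept) and the difference between the two group means (the coefficient). It assumes that the predictor is 1 for episodes released after 30 September 2021 and 0 otherwise, and that the response is one sentiment score per episode. The names `after_event` and `fit_two_group`, the dates, and the scores are illustrative only.

```python
from datetime import date
from statistics import mean

# End of Q3 2021, the assumed end of pandemic conditions.
PANDEMIC_END = date(2021, 9, 30)


def after_event(release_date):
    """Return 1 if the episode was released after the cut-off, else 0."""
    return 1 if release_date > PANDEMIC_END else 0


def fit_two_group(xs, ys):
    """Least-squares fit of y = beta0 + beta_after * x for binary x.

    With a 0/1 predictor, beta0 is the mean of the x == 0 group and
    beta_after is the difference between the two group means.
    """
    before = [y for x, y in zip(xs, ys) if x == 0]
    after = [y for x, y in zip(xs, ys) if x == 1]
    beta0 = mean(before)
    beta_after = mean(after) - beta0
    return beta0, beta_after


# Made-up release dates and sentiment scores, for illustration only.
episodes = [
    (date(2020, 5, 1), 0.25),
    (date(2021, 3, 15), 0.5),
    (date(2021, 11, 1), 0.75),
    (date(2022, 6, 1), 1.0),
]

xs = [after_event(d) for d, _ in episodes]
ys = [score for _, score in episodes]

beta0, beta_after = fit_two_group(xs, ys)
print(f"beta0 = {beta0:.3f}, beta_after_event = {beta_after:.3f}")
```

The Bayesian version adds priors and returns a full posterior distribution rather than two point estimates, but the coefficient is read the same way: it is the shift in expected sentiment after the event.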

This provided the results as follows:

Population-Level Effects:

Parameter	Estimate	Est.Error	l-95% CI	u-95% CI	Rhat	Bulk_ESS	Tail_ESS
Intercept	0.37	0.06	0.26	0.48	1.00	3884	2917
after_event	0.39	0.08	0.23	0.54	1.00	3561	2976

Family Specific Parameters:

Parameter	Estimate	Est.Error	l-95% CI	u-95% CI	Rhat	Bulk_ESS	Tail_ESS
sigma	0.38	0.03	0.33	0.44	1.00	3608	2817
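
Read against the model definition, and assuming the predictor is 0 before the event and 1 after it, the point estimates imply the following expected values of the response. This is only arithmetic on the reported estimates; an interval for the sum cannot be obtained by adding the interval endpoints.

$$\mu_{\text{before}} = \beta_0 \approx 0.37, \qquad \mu_{\text{after}} = \beta_0 + \beta_{\text{after\_event}} \approx 0.37 + 0.39 = 0.76$$

The 95% interval for after_event (0.23 to 0.54) does not include zero, which is consistent with a more positive sentiment after the assumed end of pandemic conditions.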

Other:

The model overlooked Mike Levi's contribution to the History series. That is a bit unfair.

Where there were multiple guests, I did not include them all in the database, hence "no specific guest listed".