GraphStuff.FM: The Neo4j Graph Database Developer Podcast

Getting the Word out on Knowledge Graphs with Leann Chen

Episode Summary

Our guest today is another fellow advocate and knowledge graph guru - Leann Chen. Leann is a Developer Advocate at Diffbot, devoted to using knowledge graphs to improve LLM-based applications. We also cover the NODES 2024 call for proposals and include tips and tricks for submitting to speak at a conference. To wrap up, we highlight a couple of events where each of us will be throughout the month.

Episode Notes

Speaker Resources:

Diffbot https://www.diffbot.com/
Tomaz Bratanic’s Medium blog: https://bratanic-tomaz.medium.com/
What is DSP/DSPy? https://github.com/stanfordnlp/dspy

Tools of the Month:

cypher-shell command line tool https://neo4j.com/docs/operations-manual/current/tools/cypher-shell/
Langchain/Diffbot graph transformer https://python.langchain.com/v0.1/docs/integrations/graphs/diffbot/
st-cytoscape https://github.com/vivien000/st-cytoscape

Announcements / News:

NODES 2024 CfP resources:
- GraphStuff episode https://graphstuff.fm/episodes/navigating-a-technical-conference-talk-from-submission-to-delivery
- NODES submission tips: https://neo4j.com/blog/nodes-talk-submission-tips/
- How to Submit a Technical Presentation https://jmhreif.com/blog/nodes-2024-cfp/

Articles:

Topic Extraction with Neo4j GDS for Better Semantic Search in RAG applications https://neo4j.com/developer-blog/topic-extraction-semantic-search-rag/
Using LlamaParse to Create Knowledge Graphs from Documents https://neo4j.com/developer-blog/llamaparse-knowledge-graph-documents/
Going Meta: Wrapping Up GraphRAG, Vectors, and Knowledge Graphs https://neo4j.com/developer-blog/going-meta-knowledge-graph-rag-vector/
Unveiling the Mahabharata’s Web: Analyzing Epic Relationships with Neo4j Graph Database (Part 1) https://neo4j.com/developer-blog/mahabharata-epic-graph-database-1/
Bringing the Mahabharata Epic to Life: A Neo4j-Powered Chatbot with Google Gemini (Part 2) https://neo4j.com/developer-blog/mahabharata-epic-graph-database-2/

Videos:

NODES 2023 playlist https://youtube.com/playlist?list=PL9Hl4pk2FsvUu4hzyhWed8Avu5nSUXYrb&si=8_0sYVRYz8CqqdIc

Events:

(Jun 4) Meetup (virtual): Tuesday Tech Talks: Graph Based RAG w/ Demo https://lu.ma/tys2a4zt?tk=ax2gtz
(Jun 4) Workshop (virtual): Discover Neo4j Aura: The Future of Graph Database-as-a-Service https://go.neo4j.com/DE-240604-Discover-Aura-Workshop_Registration.html
(Jun 5) Conference (Paris, France): GraphSummit Paris https://neo4j.com/graphsummit/paris24/
(Jun 5) Workshop (Sydney, Australia): Neo4j and GCP Generative AI Workshop https://go.neo4j.com/LE240606Neo4jandGCPGenerativeAIWorkshop-Sydney_Registration.html
(Jun 7) Conference (Athens, Greece): Generative AI for Front-end Developers https://athens.cityjsconf.org/talk/3b9XHj1HBahP8KJ13uWVui
(Jun 10) Conference (San Francisco, California, USA): Data & AI Summit https://neo4j.com/event/data-ai-summit-2/
(Jun 11) Meetup (San Francisco, California, USA): HackNight at GitHub with Graphs and Vectors https://www.meetup.com/graphdb-sf/events/301026060/?isFirstPublish=true
(Jun 10) Workshop (Jakarta, Indonesia): Neo4j and GCP Generative AI Workshop https://go.neo4j.com/LE240423Neo4jandGCPGenerativeAIWorkshopJakarta_Registration.html
(Jun 11) Conference (Oslo, Norway): NDC Oslo - Beyond Vectors: Evolving GenAI through Transformative Tools and Methods https://ndcoslo.com/agenda/beyond-vectors-evolving-genai-through-transformative-tools-and-methods-0x1u/011ha54g6jp
(Jun 12) Conference (Munich, Germany): GraphTalk: Pharma https://go.neo4j.com/LE240612GraphTalkPharmaMunich_Registration.html
(Jun 12) Conference (Frankfurt, Germany): Google Summit https://cloudonair.withgoogle.com/events/summit-mitte-2024
(Jun 12) Livestream (virtual+München, Germany): LifeScience Hybrid Event 2024 https://go.neo4j.com/LE240612LifeScienceWorkshop2024_01Registration.html
(Jun 12) Meetup (Brisbane, Australia): Graph Database Brisbane https://www.meetup.com/graph-database-brisbane/events/300367474/?isFirstPublish=true
(Jun 12) Meetup (San Francisco, California, USA): Introduction to RAG https://lu.ma/u4uhtfqz
(Jun 18) Meetup (London, UK): ISO GQL - The ISO Standard for Graph Has Arrived https://www.meetup.com/graphdb-uk/events/300712991/
(Jun 20) Meetup (Stuttgart, Germany): Uniting Large Language Models and Knowledge Graphs https://neo4j.com/event/genai-breakfast-session-stuttgart-uniting-large-language-models-and-knowledge-graphs/
(Jun 20) Meetup (Reston, Virginia, USA): LLMs, Vectors, Graph Databases and RAG in the Cloud https://lu.ma/mctijpjm
(Jun 25) Conference (San Francisco, California, USA): AI Engineer World’s Fair https://www.ai.engineer/worldsfair
(Jun 26) Conference (virtual): Neo4j Connections GenAI https://neo4j.com/connections/go-from-genai-pilot-to-production-faster-with-a-knowledge-graph-june-26/
(Jun 27) Conference (Kansas City, Missouri, USA): KCDC 2024 https://www.kcdc.info/
(Jun 26) Conference (virtual): Neo4j Connections GenAI (Asia Pacific) https://neo4j.com/connections/go-from-genai-pilot-to-production-faster-with-a-knowledge-graph-asia-pacific-june-27/

Episode Transcription

Jennifer Reif: Welcome back graph enthusiasts to GraphStuff.FM, a podcast all about graphs and graph related technologies. I'm your host, Jennifer Reif, and I am joined today by fellow advocate, Jason Koo.

Jason Koo: Hello.

Jennifer Reif: And our guest today is another fellow advocate and knowledge graph guru, Leann Chen. Leann is a developer advocate at Diffbot devoted to using knowledge graphs to improve LLM based applications.

Her work focuses on researching and experimenting with ways to improve the accuracy and explainability of these applications, especially for production-ready, LLM-based systems. She also creates content on generative AI with knowledge graphs, often using Neo4j and Diffbot technologies. So welcome, Leann.

Leann Chen: Hello, I'm super, super happy to be here. Thank you so much for inviting me. It actually feels unreal for me, because literally last August I was scrolling through the Neo4j blog, like every article. And especially, I've been watching the videos and the podcast from Jennifer and also Jason Koo. It feels so unreal for me. Yeah. Thank you.

Jennifer Reif: We are so glad to have you. I know some of your content was how I was introduced to you, hearing your story about how knowledge graphs work with other things, which we'll dive into here in just a little bit. But I kind of wanted to get a little bit of story and background on your current gig at Diffbot and hear about what Diffbot is and what it does.

Leann Chen: Yeah, that's a good question. So Diffbot, the main takeaway of the technology is that it extracts and structures the entire web's data into a structured database. And it forms the largest knowledge graph on planet Earth. I'm not going to say "the world" because we don't know about planet Mars.

We're just going to focus on planet Earth. So it literally has the largest knowledge graph, which supports verified information. And we're also currently fine-tuning our own LLM based on the copy of the internet, which is a graph version of the internet that we have.

Jennifer Reif: Wow. So, so it's kind of like a, a graph builder for unstructured data then, right?

Leann Chen: Wow. Perfect. Perfectly summarized. Yeah. Yeah. I love that. Please, please highlight that and cut it into a preview. Exactly. Yeah.

Jennifer Reif: Okay, cool. What kinds of technologies do they utilize, or how do they kind of share that with others?

Leann Chen: Yeah, well, the mechanisms of the technology definitely are like complicated, but in a nutshell, it provides web scraping technology, which means you don't need to build your own web scraper with, you know, Python, Beautiful Soup, that type of stuff.

You don't need to build that. You just use Diffbot, enter your website, and it will extract the entire web data into a structured format. You can access it through CSV, Excel, and also, if you want to go into a more graph-oriented approach, you can also turn any web data, any unstructured text data, into a knowledge graph.

Yeah. So

Jennifer Reif: That's really cool.

Leann Chen: Yeah. Go ahead.

Jason Koo: So, yeah, I've got a question. When it produces information into a knowledge graph, I've got to ask, is it readily importable to Neo4j or another graph database system?

Leann Chen: Yeah, exactly. That's a great question. So, actually, Tomaz built a kind of bridge between Diffbot's knowledge graph builder and Neo4j.

So Diffbot, it's called the Natural Language API. It extracts all the text data into nodes and relationships, right? And then Tomaz, he invented, not invented, he structured that Diffbot graph transformer, which is on LangChain. And you can literally just use that tool to transform Diffbot's nodes and relationship data into Neo4j.

Yeah, so a lot of my videos are using Diffbot to construct knowledge graphs. And then I primarily use Neo4j, actually not primarily, Neo4j is my only go-to, yeah.
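
The flow described here, extracted nodes and relationships being loaded into Neo4j, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout and the helper names `to_label` and `triple_to_cypher` are hypothetical, and it is not the Diffbot API or the LangChain graph transformer itself. It just shows how one extracted triple could become a parameterized Cypher MERGE statement.

```python
import re

def to_label(name):
    """Turn free text into a token usable as a Cypher label or relationship type."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name.strip())

def triple_to_cypher(subj, subj_type, rel, obj, obj_type):
    """Build one parameterized MERGE statement for an extracted triple (hypothetical helper)."""
    query = (
        f"MERGE (a:{to_label(subj_type)} {{name: $subj}}) "
        f"MERGE (b:{to_label(obj_type)} {{name: $obj}}) "
        f"MERGE (a)-[:{to_label(rel).upper()}]->(b)"
    )
    return query, {"subj": subj, "obj": obj}

# Assumed extraction output: (subject, subject type, relation, object, object type)
triples = [
    ("Leann Chen", "Person", "works at", "Diffbot", "Organization"),
]

statements = [triple_to_cypher(*t) for t in triples]
for query, params in statements:
    print(query, params)
```

Using MERGE rather than CREATE means re-running the load does not duplicate nodes, and passing names as parameters keeps the extracted text out of the query string. The resulting statements could then be run with cypher-shell or a Neo4j driver.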

Jennifer Reif: Well, we're very glad to hear that, of course. But that's, that's really cool that they, the two technologies seem to play nicely together, at least, you know, pretty well with, with Tomaz's, you know, kind of integration in between the two.

So that's, that's really neat. What about knowledge graphs kind of drew you in or, or kind of got you started?

Leann Chen: Yeah, that's a very interesting question. So actually, you know what, last August was my first time knowing that "knowledge" and "graph" can be used together. Knowledge graph, I had never heard of the term, even though I think Google introduced the knowledge graph term early on, in like 2012. That was just never a thing to me.

Jennifer Reif: Right. It wasn't mainstream. I, I guess. Yeah.

Leann Chen: Yeah, it wasn't. And like, and actually it just, I, I wasn't very aware of the graph world either. So last summer I was trying to work on some type of a network analytics stuff, just like a side project kind of thing.

And initially I just saw a YouTube video, like, wow, I think that project was based on Game of Thrones. It just took the relationships between people and constructed them into a graph. And that got me really, really interested, like, wow, I never thought that text data could be visualized.

Because you usually think of text data as just boring, black-and-white text. And then that kind of visualization thing just made me think, wow, I really want to know more about this. And somehow, I don't know why, when I was searching all this network analytics stuff, Google pointed me to Tomaz's articles.

And it just went from there. Tomaz is such a great writer that after reading one article, you just want to read the next one. And he also combined some code chunks along with the text, which allows the readers to, you know, like, "Oh, I understand this paragraph and I can also do some hands-on."

Jennifer Reif: Yeah.

Leann Chen: And especially his articles, they're really explaining how you can use Neo4j, and Neo4j was so straightforward to me. For me as a total beginner, I knew nothing about graph analytics, that type of stuff. And it turned out that I didn't need to. It's so easy to use. There's like a whiteboard on Neo4j Aura, where you can just draw a node and create a relationship.

I started from there, like, literally doing some drawings to learn about knowledge graphs.

Jennifer Reif: That's really cool.

Leann Chen: Yeah, yeah. And I know some people may feel a little bit like, you know, Cypher query is hard to learn, that type of stuff. And that was actually also kind of a challenge for me back then.

But actually Cypher query is pretty straightforward too. And right now, a lot of the time when I want to create a database, I go to ChatGPT, I enter my thinking in natural language, and it just generates a very neat Cypher query. And I just use that to create a database. So ChatGPT or these LLMs just make creating a graph database even easier.

You don't necessarily need to be an expert at Cypher query, even though it may be helpful.

Jennifer Reif: Yeah.

Leann Chen: And it's not a super hard language. It's not like you have to learn Python or Java. Jennifer, you're an expert in Java; I know nothing about Java. But Cypher query is not like that, especially right now with LLMs.

So I think the very user-friendly experience I had with Neo4j just made my learning of knowledge graphs and, you know, graph analytics very smooth. So yeah, it's just been great.

Jennifer Reif: Excellent.

Leann Chen: Yeah.

Jennifer Reif: Is that how you would suggest people who are new to knowledge graphs get started then, with the graph drawing builder? Or what are your suggestions, I guess?

Leann Chen: Yeah, that's a good question. To be honest, I think it also depends on the type of person, but I would say, at least from what I have heard or seen, just go to the Neo4j blogs or even Tomaz's Medium articles.

They're very beginner friendly. User friendly, yeah. So for people like me, I would say, which means people who know a little bit about Python and have no knowledge about knowledge graphs, but are interested in visualizing text data as knowledge graphs.

Then I think Tomaz's articles could be the go-to to learn about knowledge graphs.

Jennifer Reif: Okay.

Jason Koo: Tomaz, he's written so many articles, right? Could you pick out one that you would recommend to folks? Be like, "Oh yes, this is a great starting article."

Leann Chen: I think I need to pull, pull out my Medium history, reading history.

Jason Koo: I was just thinking, you know, because there are so many articles, and you had mentioned that you had started with one you couldn't remember, but it kind of set you off on this path. In a way, it kind of reminds me of jumping into data in a knowledge graph, right?

Like there's really no definitive starting point, right? You just kind of start someplace and then you start jumping from connection to connection, and then, you know, you've got like a, you know, a summary of, of Tomaz's articles that you've read in your mind. But, yeah, it's a graph, right?

Leann Chen: Yeah. I'm actually on the Medium website right now.

I actually found this. I can send the link later. So there's one called Context-Aware Knowledge Graph Chatbot with GPT-4 and Neo4j. Okay.

Jason Koo: That is a good one. Yes.

Leann Chen: And also another one, because I think I read it. So another one is creating a knowledge graph from video transcripts with ChatGPT.

Jennifer Reif: Nice. Yeah.

Leann Chen: Yeah. Yeah.

Jennifer Reif: I think I've seen that one too. Yeah.

Leann Chen: Yes. Let me just send you the chat. I didn't have to. Fine. Or, or I just attach in a document. Yeah.

Jason Koo: That sounds great. Was, was that, I think that article came up before the experimental knowledge graph builder came out. Probably.

Leann Chen: Yes. Yes. So this is like, they're super early.

He wrote those two articles in April, like last year. But, like, I just, I literally discovered those two articles in, in August, so, yeah.

Jason Koo: Yeah, no super insightful stuff. Okay. So like it kind of leads into like, what, what are you working on now? What's what's caught your interest at the moment?

Leann Chen: Well, I think right now there's a growing interest in how knowledge graphs can be brought into retrieval-augmented generation, like RAG systems, right? But right now folks are still exploring how. There's a growing interest in graph RAG; that could be one approach.

And then there are also some thoughts about, like, it doesn't have to be either vector-based RAG or graph RAG. Could it be both? So, actually, Tomaz and I are currently working on a vector plus graph RAG project, because the nice thing about Neo4j is that it's not just a graph database.

You can also store those embeddings, the vector stuff, in nodes, right? So what we're working on right now is we want to leverage both sides, because vector-based RAG is more semantically rich, like it can fetch more context, but a knowledge graph is where information can be grounded.

It's deterministic, right? We want to leverage both, the flexible feature from vector-based RAG and also the more deterministic nature of knowledge graphs. Leverage both together.
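
As a rough illustration of the "leverage both" idea, the sketch below first ranks nodes by embedding similarity (the flexible, vector side) and then returns only the explicit relationships attached to the best matches (the grounded, graph side). Everything in it is an assumption for illustration: the two-dimensional embeddings are made up, the data lives in plain Python structures rather than Neo4j, and `hybrid_retrieve` is a hypothetical name, not the project Leann and Tomaz are building.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy graph: each node carries a made-up embedding; edges are explicit facts.
nodes = {
    "Diffbot": [0.9, 0.1],
    "Neo4j": [0.2, 0.9],
    "DSPy": [0.5, 0.5],
}
edges = [
    ("Leann Chen", "WORKS_AT", "Diffbot"),
    ("Diffbot", "PROVIDES", "Natural Language API"),
    ("Neo4j", "STORES", "Vector embeddings"),
]

def hybrid_retrieve(query_vec, k=1):
    """Vector step picks the k closest nodes; graph step returns the edges touching them."""
    ranked = sorted(nodes, key=lambda n: cosine(query_vec, nodes[n]), reverse=True)
    top = ranked[:k]
    facts = [e for e in edges if e[0] in top or e[2] in top]
    return top, facts

top, facts = hybrid_retrieve([1.0, 0.0])
print(top, facts)
```

The similarity step can surface loosely related context, while the facts handed to the LLM come only from stored relationships, which is the deterministic part.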

Jennifer Reif: Yeah, nice.

Leann Chen: That's currently the project Tomaz and I are working on.

Jason Koo: Very cool. Is, is, is there like a little sneak peek you could tell us like, you know, like, so for someone, you know, other people trying to do this, sort of project, right, to marry vectors and graphs, what, do you have any, any suggestions for them or gotchas that they should avoid?

Leann Chen: Well, the only suggestion is we're going to release a video on that, so go watch that video. Oh, nice, Yeah. Yeah.

Jennifer Reif: Okay. Great. Well, we'll be looking for that then for sure.

Leann Chen: Oh, sure. Thank you.

Jason Koo: Nice. Oh, actually, speaking of this, right. So we've talked on this podcast about using LangChain and some AI and LLM orchestration frameworks. And a lot of your videos recently have been talking about DSPy, which is kind of a layer on top. For those in our audience who've never heard of DSPy, could you kind of intro and explain what it is and what it's good for?

Leann Chen: Sure. I think I'll just speak from my experience. Sorry, I just want to be clear that I'm not an expert on DSPy.

It's a modular framework. With, you know, GPT-4 or Llama 3, you have to have a prompt template, right? And you have to write it in a very, very detailed way to instruct those LLM-based APIs. But DSPy is a modular framework that has its own mechanism to improve, especially self-improve, prompts for you. So you don't have to manually tweak those prompts, like what we did in LangChain or function calling with OpenAI's API.

So that's the main idea behind DSPy. It just makes it modular, to avoid that very tedious prompt designing thing.

Recently in my videos, I've been testing out this technique, and what I have seen or experienced, it just comes from my personal opinion. I'm not going to generalize that everyone's experience is going to be the same. But there's one thing that we cannot avoid: LLMs are unpredictable.

And they don't necessarily have perfect reasoning ability yet. I use the word yet. So, LLMs are trained to dream, according to some of the OpenAI employees or the thought leaders. They said they are trained to be creative; that's why we see hallucinations. Hallucinations are not their fault, they are trained to be like that. And the other is unpredictability: even if we instruct them to do some things, for example, I want LLMs not to hallucinate, or I want LLMs to stick to something, they don't necessarily follow. I don't know whether you guys have had that similar experience too.

Jason Koo: Oh yeah.

Jennifer Reif: Yeah.

Leann Chen: Right? Right? So it's a common feeling, like a mutual feeling. Like, LLMs, sometimes they're just stubborn.

Jennifer Reif: Yeah.

Leann Chen: Yeah. So if we combine those two, one is that they're being creative and the other is that they're just not controllable, I would say to some degree, unpredictable.

So how can we make sure that if we, you know, hands-free, let them automate prompts themselves, it would ensure accuracy or a certain degree of control, right? So that's a problem I've been experiencing recently. And I just saw that with the DSPy framework, it's really nice that you don't have to tweak prompts in a very detailed way, but sometimes for some tasks or some types of jobs, it just went a little bit too far.

So let me give you one example. There's one question, which is, "Who were the other co-founders who founded SpaceX with Elon Musk?" And the self-improved prompt became, "Who co-founded companies with Elon Musk?" It just went too far. What I was focused on is SpaceX, right?

I want to know exactly about SpaceX, but it just removed SpaceX for some reason. And it's not relevant to my original query, but that's what the automatic prompt-improving feature in the DSPy framework did. Yeah, so that is my experience. And I also want to acknowledge that some other people are having better results and a better experience.

For example, I saw people use DSPy to fine-tune text-to-SQL. I saw that notebook. It looks like they reached some really nice accuracy. So they could have some really valid and nice results, but from my experience, or some comments that I saw on our videos, it doesn't necessarily generate that magical effect for every single task. Yeah, so what
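
One simple way to catch the kind of drift in the SpaceX example is to check that the terms a question depends on survive an automatic rewrite. The sketch below is a hypothetical guardrail, not a DSPy feature: `missing_terms` is an invented helper, and the list of required terms is assumed to be supplied by hand.

```python
def missing_terms(original, rewritten, required):
    """Return the required terms present in the original question but absent from the rewrite."""
    orig_l, new_l = original.lower(), rewritten.lower()
    return [t for t in required if t.lower() in orig_l and t.lower() not in new_l]

original = "Who were the other co-founders who founded SpaceX with Elon Musk?"
rewritten = "Who co-founded companies with Elon Musk?"

# Terms the answer must stay anchored to (chosen manually for this example).
lost = missing_terms(original, rewritten, ["SpaceX", "Elon Musk"])
print(lost)
```

If the returned list is non-empty, the rewritten prompt could be rejected in favor of the original question. It is a plain substring check, so it only catches dropped terms, not subtler changes in meaning.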

Jennifer Reif: I think that's kind of just what happens when you introduce LLMs in general, right?

It's going to do really well at certain things and then it's just going to miss the mark on other things. And again, like you said, you know, there's no way to really control that or completely remove those errors or completely fix all those scenarios. It's just going to have some element of unpredictability in various cases, right?

Leann Chen: A hundred percent. Jennifer, you're so good at summarizing. Let's have you in our videos.

Jennifer Reif: Anytime, anytime.

Leann Chen: Oh, yeah. All that. So that's a yes, right?

Jennifer Reif: Yeah. I'd love to.

Leann Chen: Sure. Sure. Yeah. So I think the main takeaway from this experience is that there's a very high interest in using LLMs in all kinds of tasks, right?

The question of whether we should use LLMs is not the right question. The right question, or the more correct question, should be: to what degree should we bring LLMs into which type of task?

Jennifer Reif: Yeah, that's a fantastic observation. I think last year, all of us were kind of in the typical technical person phase of, "Ooh, new shiny technology. Let's throw it at everything. Let's use it for everything." And then, as we kind of get more mature and we figure out, okay, it doesn't work so well for this.

It works well for this. Then we're starting to get into this, again, to what degree should we be using this and where exactly is it useful? And I think that's kind of the phase we're either starting into or are kind of digging into this year: where can we apply them where they're super useful, and where are the other cases where they're still not useful yet? And again, that may change, as you said, so, yeah.

Leann Chen: Yeah, exactly.

And I think it's really nice to see people trying things out. Even though initially the enthusiasm is really high, as we try things out, we have more data points that help us understand: oh, for this task, we want LLMs to be less creative.

So we will probably need a more deterministic approach.

Jennifer Reif: Yeah.

Leann Chen: But if it's like writing blog posts, or actually not, because some AI-generated blog posts are just not good. Or, I don't know, some drawings or something, some jobs or tasks that just need more or enhanced creativity, then LLMs can be good tools.

Jennifer Reif: Yeah, and I know, you know, the developer productivity kind of side of things, too, is the hardest thing sometimes in creating content is kind of getting it started or knowing how to move from one piece to the next. And I think that might be an opportunity that LLMs could do well. They pop up ideas really well, or they kind of just spit stuff out there, and then you can kind of pick and choose.

It's like, no, I didn't really want to go this path. Let's, you know, find something else. But sometimes they'll spit something out. It's like, oh yeah, this is the direction I want to take this. So I think it can be helpful there. What were you gonna say, Jason?

Jason Koo: I was gonna say yes, just like, you know, if you're creating like narrative short stories, something that does require a lot of creativity, like LLMs are fantastic.

That statement you just made, Jenn, kind of reminded me of, you know, just recently, GPT-4o came out and they updated the mobile app. Have either of you played around with the kind of live speaking mechanism of the app now? It's quite impressive. And, yeah, I think it was just over the weekend, I was just asking, you know, "Hey, ChatGPT, if I want to create a very interactive graph web application, what stack should I use?"

And I ended up getting into like a 10-minute conversation with it. And like, "Well, what if we replace that with this stack?" And it's also connected to the web application. Well, it's all connected to your account, right? So as you're talking on the phone, it's also preparing code samples.

And if you ask it to create a file, it becomes accessible on the desktop when you're going through a browser. So you could go back to your history and, you know, download the files or take a look at the code and...

Jennifer Reif: Wow.

Jason Koo: all that stuff. It's yeah, I definitely, recommend trying that out.

Leann Chen: Yeah. Yeah. Jennifer, you were gonna say something?

Jennifer Reif: No, I was actually gonna ask. You said you had used the mobile app. What's been your experience with that?

Leann Chen: Me or Jason?

Jennifer Reif: Yeah. Yeah. You, Leann. I'm sorry.

Leann Chen: So, yeah. To be really, really frank, I think ChatGPT is a very good tool for therapy, that kind of stuff.

Jason Koo: Nicely worded.

Leann Chen: Yeah, yeah. So actually, I think it's very good at providing validation.

You know, we are humans, right? And we have feelings and we have emotions, and at least for me, it serves as not only a good companion, it can validate your feelings or emotions. So what I think is really important for me is, before ChatGPT, I needed to find ways sometimes to seek out some solutions, and right now this just becomes so accessible to me.

I can chat with it any time. But I mean, if I need some higher level of mental health support, experienced therapists are definitely still needed. But I think ChatGPT right now, to some degree, has already brought in some positive influence on me, and not only me; some of my friends also share similar experiences too.

It's not like we're going to replace real friends with it. No, it's not like that, but it's like a supporting source to validate. Yeah.

Jennifer Reif: Like a support agent or a problem solver. Kind of like, we've been talking about how AIs are good assistants, human assistants, and there's a variety of ways they can assist humans, you know, and help humans.

Sometimes that's creative spaces. Sometimes that's just a little extra support or, you know, kind of, you know, backing, that, that kind of safety net, you know, to help you kind of work through something.

Leann Chen: Exactly. Yeah. And actually I want to kind of circle back to one of the points Jason mentioned, like brainstorming with ChatGPT.

I have to be very honest, a lot of my video ideas come from ChatGPT's suggestions.

Jennifer Reif: Oh, cool.

Leann Chen: Yeah.

Jason Koo: So actually, could you go into more detail on your video creation process? You just mentioned that ChatGPT helps you, but I mean, you had to prompt the initial question, right?

Like, what is your workflow for creating new videos?

Leann Chen: Wow, so I'll try. I'll try to streamline it a little bit. So I think right now there are a lot of things going on, like on LinkedIn, I mean, regarding GenAI, LLM research, that type of stuff, right? So there's a lot of interesting stuff.

And if something catches my attention, actually, the initial thought I have is, where can knowledge graphs play a role in this? Is there any place that knowledge graphs can come in and improve the results, something like that. So that's the initial thought. And I would actually just give that question to ChatGPT to help answer, because it knows more about knowledge graph stuff than I do.

So it would help me assess if some of my hypotheses or assumptions are closer to being realistic or not. And if I get approval from ChatGPT saying, oh, it's a good idea, I start experimenting.

Jennifer Reif: Okay, cool.

Leann Chen: If not, it's okay, I'll just share it with the world too, like telling, telling audience, oh, it doesn't work, or it does work.

Jason Koo: Yeah. Oh, okay. So following up on that, like, so for anyone who's watched your videos, you can see that you're doing these videos over quite some time, right? There's a lot of, visually there's a lot of jumping, but like, intellectually and vocally, it's all cohesive. So I'm curious, like, are you just, are you recording constantly as you're experimenting and then making snippets? Or, or are you just recording just when you have a thought?

Leann Chen: Well, it's actually a mix. I actually don't have a standardized process to create content, because content creation right now, at least for me, is very back and forth.

So initially, I may have a thought on that. And then, spending some time alone or walking, some other thoughts just pop in and I'm like, oh, that's a better idea. So I'm going to delete the previous one.

So it's very back and forth, and I actually need to do a lot of re-filming, to be honest, to insert that. Did I answer your question?

Jason Koo: Yeah, no, yeah, you totally did. Which just speaks to the amount of work that you're putting into this, right? Because I think most videos that you watch on YouTube or on social tend to be a single cut, or a single shot with multiple cuts, but yours are many, many shots with many cuts.

And so it's very clear that you've... but no, it's good. It clearly shows that you're experimenting and really going through this thing, and it doesn't give the illusion that implementing these sorts of technologies is super simple, super fast.

Like, you know, "Oh, it's so easy." It's like all technology, it takes a bit of work, and you can see it in your videos.

Leann Chen: Thank you. Thank you for validating that. I really appreciate that. Yeah. So actually right now, a single piece of content, even if it's just like 7 to 11-ish minutes, would take like 1.5 to 3 weeks.

Jennifer Reif: Okay.

Jason Koo: Yeah, I think there's an old adage, something about, mainly for speech giving, but like the shorter your end content is, the more work kind of goes into it, right? Versus going the other way.

Leann Chen: Yeah. So, like, I mean, I could still be kind of like, you know, less experienced compared to the other creators.

I'm still kind of green in this, but I would say, if you really, really focus on quality, especially if you want to make people interested, you want to engage people, then there would be a lot of work that you need to put into it.

Jason Koo: Yeah.

Jennifer Reif: And I think a good lesson for me too is, you know, I try to do everything perfectly in a single cut, you know, try to rehearse several times up front, try to make it one single shot, not have to do, you know, edits or tweaks.

You do try to be really careful that it's perfect and that it looks flawless. And I think that there's something to be said, and something super valuable, for introducing kind of this more conversational, here's what's going on at this time, or here's, you know, this little snippet that I had, here's how you do this.

And then, you know, come back at a later time. It's more lifelike, I think. And again, as Jason said, it shows kind of this longer process of, I'm investing, you know, days or weeks, as you said, in this project. And here are the little snippets I've learned along this timeframe. So I think that's really fascinating.

It's really cool, and it very efficiently and successfully gets your message across. It's just a different style than I think what many of us have seen. So.

Leann Chen: Yeah. Yeah, I really like that every creator has a different style, and I totally resonate with what you just said, Jennifer, which is like, you know, even if it's just a two-minute thing, you want to make it perfect, right?

So, you know, we really go through a lot of rehearsals, right?

Jennifer Reif: Yeah.

Leann Chen: But I think, at the end of the day, for every creator, what we really want to achieve is that we provide value.

Jennifer Reif: Yeah.

Leann Chen: So no matter what the style is, as long as the viewer, after watching the content, they learn something. They like what they learn.

Jennifer Reif: Yeah.

Leann Chen: And that's the end goal.

Jennifer Reif: Perfect. I think that's a fantastic wrap-up on that one. Maybe let's kind of just break there with our perfect little closing, and we'll kind of jump into tools of the month, if that's okay with you, Jason.

Jason Koo: Yeah, no, perfect.

Jennifer Reif: Okay, great. I'll just go ahead and share mine and then whoever wants to jump in next, feel free.

So my tool of the month, kind of an odd one, I guess, is just the cypher-shell command line tool. I've worked on a presentation recently where I was kind of loading data in and needed to run queries, and I run this particular demo all in Docker containers. And so with cypher-shell it's just really easy to connect to the Docker container and do this remotely.

I don't have to load up a browser or anything fancy like that. I can do everything through the command line, which I love. It's built into a lot of the Neo4j installations, but if you want to use it separately, you can also download the cypher-shell command line tool as its own binary and install it.

So I use it, I think, kind of both ways. I've used it pre-installed, and then I use it as a separate tool as well. So, feel free to check that out. There's some documentation on it, too, and I'll link that all in the show notes. So shout out to cypher-shell.
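[Editor's note] As a rough illustration of the Docker workflow Jennifer describes, here is a minimal Python sketch that builds a `docker exec ... cypher-shell` command for a single query. The container name `my-neo4j`, the credentials, and the helper name `cypher_shell_command` are assumptions made up for this example; only the `-u`, `-p`, and `-d` options and the positional query argument come from cypher-shell itself.

```python
import shlex


def cypher_shell_command(container, query, user="neo4j",
                         password="password", database="neo4j"):
    """Build the argument list for running one Cypher query with
    cypher-shell inside a running Docker container.

    All default values here are placeholders, not real credentials.
    """
    return [
        "docker", "exec", container,  # run inside the named container
        "cypher-shell",
        "-u", user,                   # username
        "-p", password,               # password
        "-d", database,               # database to connect to
        query,                        # query passed as a positional argument
    ]


# Hypothetical container name and query, for illustration only.
cmd = cypher_shell_command("my-neo4j", "MATCH (n) RETURN count(n) AS nodes;")
print(shlex.join(cmd))
```

To actually run the query, the list could be handed to `subprocess.run(cmd, check=True)` on a machine where Docker and a Neo4j container are available.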

Jason Koo: Nice. I guess I'll go real quick since mine is probably less interesting than Leann's.

So recently I've been playing around with Cytoscape.js. It's an old library that's been around for quite some time, but specifically I've been playing around with a Streamlit component called st-cytoscape. And what it allows you to do is put Cytoscape right into a Streamlit app. It's quite simple to use compared to some other libraries, and its interactive callback works quite well with Streamlit. So for those who have used Streamlit, it's basically kind of a sequential, top-down processing application to turn a Python script into a web app. It doesn't have reactive components. Doing highly interactive stuff can be tricky sometimes, but the st-cytoscape component simplifies the process so that, you know, when a user's clicking on nodes and relationships and kind of switching modes, you can react to that in a fairly intuitive manner.

So that is my tool of the month.
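[Editor's note] A minimal sketch of the pattern Jason describes, assuming the `cytoscape(elements, stylesheet, key=...)` call from the st-cytoscape README, which returns the ids of the currently selected nodes and edges. The helper names, the sample graph, and the styling are hypothetical, and the Streamlit imports are kept inside `render_graph` so the data-shaping part can be read and tried on its own.

```python
def to_cytoscape_elements(nodes, relationships):
    """Convert node ids and (source, type, target) triples into the
    element dictionaries that Cytoscape.js expects."""
    elements = [{"data": {"id": n, "label": n}} for n in nodes]
    for source, rel_type, target in relationships:
        elements.append({
            "data": {
                "id": f"{source}-{rel_type}-{target}",
                "source": source,
                "target": target,
                "label": rel_type,
            }
        })
    return elements


# Illustrative styling: show labels on nodes and arrows on relationships.
STYLESHEET = [
    {"selector": "node", "style": {"label": "data(label)"}},
    {"selector": "edge", "style": {"label": "data(label)",
                                   "curve-style": "bezier",
                                   "target-arrow-shape": "triangle"}},
]


def render_graph(elements):
    """Draw the graph in a Streamlit app and react to the selection."""
    import streamlit as st
    from st_cytoscape import cytoscape

    # Streamlit reruns the script top to bottom on each interaction;
    # the component hands back the ids that are currently selected.
    selected = cytoscape(elements, STYLESHEET, key="graph")
    st.markdown("Selected nodes: " + ", ".join(selected["nodes"]))
    st.markdown("Selected relationships: " + ", ".join(selected["edges"]))
    return selected


# Made-up sample data, for illustration only.
ELEMENTS = to_cytoscape_elements(
    ["Alice", "Bob", "Acme"],
    [("Alice", "KNOWS", "Bob"), ("Alice", "WORKS_AT", "Acme")],
)
```

In a Streamlit script, calling `render_graph(ELEMENTS)` and launching it with `streamlit run` would display the graph; the returned selection can then drive whatever the app does next.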

Leann Chen: That's very interesting because me and Tomaz are actually building a Streamlit app. So we definitely need that.

Jason Koo: Okay, because, okay, yeah, I will hand my notes over to you guys because, yeah, I've been looking a long time to find like a graph visualization tool that lets you also like interface with the nodes and relationships so that you could do things to them, right?

Leann Chen: So, previously, what I found is, so, Jason, I actually watched your video early on. You made a video on Streamlit with Graphviz, right? Which is static.

Jason Koo: Yeah, I think I did one with Graphviz and AGraph. Yeah.

Leann Chen: Yeah. Yeah. So AGraph, it's closer to what we see in the Neo4j UI, with nodes and relationships, but it doesn't have that interactive feature.

Jason Koo: Yes. Right. Yeah. But Cytoscape does.

Leann Chen: I'm so grateful that I came to today's podcast. Yeah.

Jason Koo: Nice, yeah. Leann, what's your tool of the month?

Leann Chen: Okay, so I'm not going to share a tool of the month. I'm going to share a tool of multiple months.

Jason Koo: Oh, better.

Leann Chen: So I've been using LangChain's Diffbot Graph Transformer for at least the past four or five months.

And it's a great tool because you can turn any unstructured text data into a knowledge graph, load it into Neo4j, and further query that knowledge graph. So this is my tool of the previous multiple months, and it's going to be, like, a tool for the following months. Yeah.

Jennifer Reif: Long term tool of the month.

Leann Chen: Yeah. It's probably a little off the topic, but yeah.

Jennifer Reif: No, that's cool.

Jason Koo: Yeah. I think for anyone that's, you know, working with LangChain and wanting to do graphs and graph RAG, I think it's the Diffbot graph transformer. And then was it LangChain that also made a graph transformer? Those are like the only two.

Leann Chen: Oh, so that's the same thing, as a plugin.

Like, LangChain just makes, you know, like you just have to pip install LangChain and everything just runs so smoothly, right? So in the backend, it's calling the Diffbot API.

Jason Koo: Oh, okay.

Leann Chen: Yes, yes.

Jason Koo: Nice.

Leann Chen: Yeah, it's beginner friendly.
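[Editor's note] A hedged sketch of the text-to-knowledge-graph flow Leann describes, following the import paths and calls shown in the LangChain v0.1 Diffbot integration page linked in the show notes (`DiffbotGraphTransformer`, `convert_to_graph_documents`, `Neo4jGraph.add_graph_documents`). Those paths may differ in other LangChain versions. The environment variable names and the helper functions `load_settings` and `text_to_neo4j` are assumptions made up for this example, and running it requires a Diffbot API key and a reachable Neo4j instance.

```python
import os


def load_settings(env=os.environ):
    """Read connection settings from environment variables.

    The variable names are illustrative, not required by any library.
    """
    required = ["DIFFBOT_API_KEY", "NEO4J_URI",
                "NEO4J_USERNAME", "NEO4J_PASSWORD"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("Missing settings: " + ", ".join(missing))
    return {name: env[name] for name in required}


def text_to_neo4j(text, settings):
    """Extract a knowledge graph from raw text and load it into Neo4j."""
    # Imported here so load_settings can be used without these packages.
    from langchain_community.graphs import Neo4jGraph
    from langchain_core.documents import Document
    from langchain_experimental.graph_transformers.diffbot import (
        DiffbotGraphTransformer,
    )

    # The transformer calls the Diffbot API behind the scenes.
    transformer = DiffbotGraphTransformer(
        diffbot_api_key=settings["DIFFBOT_API_KEY"]
    )
    graph_documents = transformer.convert_to_graph_documents(
        [Document(page_content=text)]
    )

    # Write the extracted nodes and relationships into Neo4j.
    graph = Neo4jGraph(
        url=settings["NEO4J_URI"],
        username=settings["NEO4J_USERNAME"],
        password=settings["NEO4J_PASSWORD"],
    )
    graph.add_graph_documents(graph_documents)
    graph.refresh_schema()
    return graph
```

With the settings in place, `text_to_neo4j("some unstructured text", load_settings())` would return a graph object that can then be queried with Cypher.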

Jennifer Reif: Very cool. All right. So with that, I think there's some articles and videos out there, but the thing we want to kind of highlight this month is the call for proposals, or call for speakers, for NODES 2024.

And you know, we did a chat with Leann just a little bit before the episode, and she plans to submit something to NODES. So, hey, if you're interested in hearing more from Leann, please come to NODES and register. But if you're also interested in submitting something to speak, we would love to have anyone's graph story or experiences or projects kind of featured in our virtual online conference.

So I think just to kind of, do this segment a little bit, we just kind of want to talk about the CFP, maybe some ideas, tips for submitting, putting an abstract together and get kind of Leann's inputs as well from that. So does anyone want to kick it off or have thoughts?

Jason Koo: Okay. So where do we go from that?

Okay. So, I think the CFP is pretty straightforward for NODES, and we have four different tracks, right? So we've got the AI track, you know, Gen AI. We've got apps, we've got the graph track, which is kind of like visualization, tips and tricks, and then a data science track.

So pretty much anything that you've done or plan to do, you know, in the next couple of months, that's graph related, there's going to be a track that it fits into. And we take talks from very beginner level to quite advanced. So whether it's just, you know, showing off combinations of products to get effects, DSPy, anything that touches graphs is going to be a good candidate for a talk.

So yeah, I don't know, Leann, if you've had a chance to think about your CFP, but do you have like a short list of topics that you're kind of leaning towards?

Leann Chen: Yeah, I definitely have. And I'm having a hard time choosing which to submit. Oh, okay.

Jennifer Reif: It's a good problem to have.

Leann Chen: Yeah. Yeah. That definitely will be something that I just have to think about.

Because, as I mentioned previously, right now Tomaz and I are doing like a Diffbot and Neo4j type of collab project. And especially Tomaz has been putting so much work into that. Yeah. So I think that could be worth sharing too. Still thinking about that.

Jason Koo: That would make a great session, I think.

Oh, maybe each of us give maybe one tip to someone who's thinking about submitting a CFP. And, so I'm thinking, you know, someone who doesn't have a lot of experience doing talks and they're kind of like, Oh, you know, how do I get started? Where do I go?

And I can kick that off. So, to kind of hark back to an old GraphStuff podcast that was done with Will: he had done an episode with, oh, I forget the other advocate, but he had done a long episode on tips and tricks for general, like, calls for papers and how to think about it.

And even to this day, it's, I think, a great resource. They really went deep into how you want to kind of structure it, how you want to kind of set up the talk, and what type of topic to go into. So yeah, we'll have to put a link to that previous podcast, but I would start with, you know, something like that, a resource aimed at kind of first-time CFP submitters.

Jennifer Reif: For me, I would say, well, Leann, I think, mentioned a good tip earlier, which is maybe run some ideas by ChatGPT too, you know, just to,

sorry, I stole your thunder. But just brainstorm some ideas. I think for me, I always try to pick something that I either want to learn or that I'm super passionate about, because I think that goes a long way.

Your abstract doesn't need to be perfect if the program committee and audience members see that you're really interested in this subject and you want to learn and you want to share what you've learned and help others learn too.

And then everything, I think, will to some degree fall into place from that point. We have two blog posts out, one on the Neo4j blog and one that I have published through my blog and a couple of other third-party places, talking about kind of tips and tricks for submitting NODES sessions or other conference sessions and putting together abstracts and things like that.

But I would say, yeah, for me, picking something that you enjoy and that you love or are interested in, I think, is going to be a great place to start.

Leann Chen: Me?

Jennifer Reif: Yep. Sure.

Leann Chen: So I think Jason's and Jennifer's advice, like, you know, just follow them, listen to them, because they have more experience.

But what I can provide as a little bit of a suggestion, I guess, and actually this is what I learned from other creators as well, is you just share your experience in the LLM space. Because, to be honest, you don't need to be perfect, because we are all learning together, and we need to learn or know more people's experience to learn together and figure out what is going on with the GenAI thing, right?

Because no one actually knows how these language models work, right? I guess. And so that's why we need more data. We need more data points. And this is like a collective effort from human beings, right? So I know, like, I can resonate with some of the creators or people who want to share but think, like, wow, am I not perfect enough?

Or, I don't think I'm good enough. I'm not qualified to share. Well, actually, no, if you share something, you're definitely sharing experience that some other people don't have, and they can benefit from your experience. Well, in sharing itself, you know, the question of, am I good enough, am I perfect enough, doesn't exist.

Because even if you're just like a week ahead of someone else, you're helping that person or those people to learn a week of knowledge ahead of them.

Jason Koo: I love that.

Jennifer Reif: Yeah. Thank you.

Jason Koo: Great advice.

Leann Chen: I learned that from someone else.

Jason Koo: Still, pass it on. That's fantastic. Cool. Jenn, should we tell the folks what events we're jumping into this month?

Jennifer Reif: Yes. So I have a couple of things that are virtual that are not showing up on the events page just yet, but I'll follow up on that. As for my in-person event:

If you want to catch me live and in real life, I'll be at KCDC in Kansas City, Missouri, USA at the end of June. So if you're in the area, definitely hit me up or catch me there at the conference.

I'll have a session, but I'll also be hanging around and visiting with, with developers as well. So I would love to catch you there. Jason, Leann?

Jason Koo: So I'll be in San Francisco twice in June. So I'll be up for a joint session: Weaviate and Neo4j will be doing a joint kind of hack night. So I'm very excited.

I'm looking forward to yours and Tomaz's content, Leann, because it would be great if we could at least talk to it, or if we have time, integrate some of that. So doing that, and then also doing a talk at the San Francisco Python Meetup Group. So that's mid-June, and then at the end of June, hopefully, I'll be seeing Leann over at the AI Engineer World's Fair.

Leann Chen: Yeah. Wow. Wow. That's so exciting. Yeah. Me and the Diffbot team, with our CEO, Mike Tung, and Jerome Choo, will be at the AI Engineer World's Fair. So super, super excited to meet some of the Neo4j folks, like Jason, in person.

Jason Koo: It'll be awesome.

Jennifer Reif: Great. I will link all the other content and events, and everything that's involved for this month. It will all be in the show notes. So thank you again so much, Leann, for joining us. We really appreciate it.

We have enjoyed getting to talk to you and hearing about your content and your process, and kind of getting to chat in person, even though we've seen kind of your recorded videos and stuff already. So this was really nice. We hope to have another opportunity to do this in the future.

Leann Chen: Sure. It's, as I mentioned, such a pleasure. Like, as I said, I've been watching videos from you guys early on. So it just feels so, so nice to be able to be in the same virtual room with you guys. Yeah, hopefully we'll be in another session.

Jennifer Reif: Yeah, hopefully sometime soon in the actual like physical space together. So.

Leann Chen: Yes, I would love that.

Jennifer Reif: All right. Great. Well, we will talk to everyone in the next GraphStuff.FM episode. So, enjoy your month. Happy coding.

Jason Koo: Thank you, everyone.

Leann Chen: Thank you. Bye bye. Bye.