GraphStuff.FM: The Neo4j Graph Database Developer Podcast

2023 Finale: LLMs and Knowledge Graphs throughout the year

Episode Summary

As we prepare to close out the year 2023, we don’t want to wind down and put our feet up just yet. Instead, we’d like to review what we at Neo4j (and a lot of the tech industry, in general) have been spending much of our time thinking about and building around….Large Language Models and Knowledge Graphs. Joining us today to help us dive deeper into these topics are Tomaž Bratanič and Oskar Hane.

Episode Notes

Episode Transcription

Jennifer Reif: All right, welcome to another episode of GraphStuff.Fm. I'm your host, Jennifer Reif, here with fellow Tech Adventure and advocate, Andreas Kollegger.

Andreas Kollegger: Hello.

Jennifer Reif: As we prepare to close out the year 2023, I can't believe this is already our last episode of the year, we don't want to wind down and put our feet up just quite yet. Instead, we'd like to review what we at Neo4j and a lot of the broader tech industry in general have been spending a lot of our time thinking about and building around this year large language models and knowledge graphs. Joining us today to help us dive a little bit deeper into these topics are Tomaz Bratanic and Oskar Hane. Tomaz is a network scientist at heart working at the intersection of graphs and machine learning. And Oskar is a software engineer now for two decades, still looking to learn more every day, hopefully as we all are. Welcome to you both and thank you so much for joining us.

Tomaž Bratanič: Happy to be here.

Oskar Hane: Yeah, thanks for having us.

Andreas Kollegger: We could maybe just retrospectively just paying attention, here we are. It's just about exactly a year ago that ChatGPT happened, right? And I remember when it first it came across, I don't know what the news channel was, where it saw it. I was like, "Oh that seems interesting" and didn't pay much attention to it, I didn't think much about it. Could not nearly have guessed what has happened since then. And as you say, Jennifer, our two guests today, you guys carried the ball forward for Neo4j over this past year from like, "Hey, this is interesting", to like, "Wow, we're doing amazing stuff". And I guess maybe part of what would be great to hear from both of you guys, what was your experience from, okay, there's this ChatGPT thing that happened to suddenly you're doing stuff? What was that transition like? Early on, was this just busy work or did it seem like you were excited? Oskar, where were you then? Where was your head?

Oskar Hane: Yeah, good question. First of all, I was blown away of like, okay, I can interact with this thing that actually knows, seemed like everything. It's like a magic person to talk to that just knows everything. But it's like the initial thoughts were like, this is going to change society really quickly. But how do you say, that slow down a little bit that the more you used it, which is once you saw that okay, maybe once you go into the details maybe it doesn't know all that much. You perceive that it knows a lot, but maybe it doesn't really know all the details.

But the transition from chatting to it into, okay, this is still super, super interesting, but what can we do with it with data that's not public? I think that's where I started. And initially there were a lot of early experiments of using the APIs once the APIs opened. Well actually before that, I think someone published a library in the unofficial APIs that you can use to build apps on top of. So that's where I started to experiment with building all applications to use the API.

Andreas Kollegger: Like the early hacker days, like okay, there's something you can use so let's see what you can do with it, right?

Oskar Hane: Yeah, totally.

Andreas Kollegger: Cool. And I love what you're saying about the, I think that was of course in everyone's experience it really does seem magical when you hear these metaphors like it's alien technology or it's like all this incredible power and the promise of artificial general intelligence is just around the corner. And maybe now we're all a bit more calm about it, it's still really incredible, but it's really good auto complete, sometimes really bad auto complete. In the same way the auto complete sometimes it's like, no, no, that's not what I was typing, I was typing something else. I feel like you get into the same corners and problems and challenges with it.

Tomaž Bratanič: Yeah, I feel like it's a bit more than that because I feel like it's talking to, the first time I've thought about this, basically talking to an overexcited ADHD person. And dig along those lines because it's really there, it wants to help you, and it's licensed but not really. And it can lose focus and then it really depends, does it have a good day? Is it a bad day?

Andreas Kollegger: It seems moody sometimes.

Tomaž Bratanič: Yes, exactly. And now what we are seeing is basically it's becoming lazier a bit, so it doesn't want to do things so it's like it says, "Okay, just copy this from other stuff". So yeah, I feel like it's dealing with, I don't know, not exactly a toddler, but maybe it's like a happy junior developer.

Andreas Kollegger: I guess were doing work with it. I guess Tomaz, you as well, you were there in the hacker phase before there were officially APIs because before LangChain had happened and LlamaIndex and things that have followed on after that. I guess from a developer perspective, like okay, this is a cool new toy to work with, what was your first instinct? So I guess Tomaz, asking you first, what was the first thing you're like, "Okay, here's what I want to try to see what this thing can do"? Was it just text or did you want to try to build stuff with it?

Tomaž Bratanič: So basically it was the first thing I tried was how good it is at generating Cypher statements because I wanted to do this for the long time because I don't know if you're aware, but there was an English to Cypher project from four years ago or something and nobody has ever done anything similar and it takes a lot of work because how I always saw was basically you've got a knowledge graph in the middle. The input is like machine learning/NLP, which is now LLM. And then the output is also machine learning/NLP, which is now also LLM. So I've always been interested in generating knowledge graphs from text and I've done that before the LLMs, right?

So you have some domain-specific entity extraction models, you have relationship extraction models, stuff like that. And then you can combine those and create some graph. Now it's good for toy presentations and very domain-specific, but it's not there yet. At least you need a considerable investment. So I did that so they didn't spark my interest as much, but like the other part when you're basically generating Cypher statement based on user input, that was basically the first thing I tried out because I remember, now I'm not going to remember his name, but there's this person who wrote a lot about biomedical domain for Neo4j.

Andreas Kollegger: Oh, biomedical. I thought you were going to refer back to Max De Marzi who's the first time I remember people experimenting with natural language like [inaudible 00:08:05].

Tomaž Bratanič: What was his name?

Jennifer Reif: Is that Sixing?

Tomaž Bratanič: Yeah, yeah, exactly.

Andreas Kollegger: Oh yeah, great.

Jennifer Reif: Yeah.

Tomaž Bratanič: He did even before the ChatGPT, GPT-3.5, right? And he had blog post with GPT-3, so the previous version where it was not conversational, and he had examples of generating Cypher statements. So what he did, he just passed five or 10 examples question Cypher and then asked basically the model to generate a Cypher statement based on that. So that was basically how my first basically API call to OpenAI was that.

Andreas Kollegger: That's cool. It's like remembering your first tweet. This is my first call to the API. I remember the day I made the call.

Tomaž Bratanič: Yeah. And then I remember when we were all waiting for the ChatGPT, like the GPT-3.5 to release, to be available via an API. We were all very excited about that.

Jennifer Reif: A lot has changed in just a few months.

Andreas Kollegger: Yeah.

Tomaž Bratanič: Yeah, exactly.

Andreas Kollegger: And Oskar, you've been carrying on some of this work, right? I guess since, and I don't actually know what the state of language models were, how many there were available prior to a year ago I suppose. But certainly now with Hugging Face, there's hundreds of models to choose from. And this idea of co-generation people have realized, okay, you actually want to tune an LLM for that particular task. And Oskar, you've been looking into that for okay, can we do better than just out of the box Cypher generation? Is there a co-generation LLM that we can help along? How has your investigation gone along with that, Oskar?

Oskar Hane: Yeah, so I've assembled a team and we spent two months on that, on fine-tuning LLMs, the different models to evaluate their ability to generate Cypher and the improvements after the fine-tuning them. So it looks really, really promising. We wanted to keep the project short and tight to not make any huge research task, but to see if there's something there and we saw significant improvements already. So this would probably be something you want to spend more time on in 2024.

Andreas Kollegger: Could you give me the dumbed down version of how you fine-tune for Cypher? Is it just like, oh, here's a bunch of Cypher statements. You just send it to an end point and it says thanks? How do you know you're doing a good job of fine-tuning? What's the process actually like?

Oskar Hane: Yeah. Well first of all, you need to have really good data, high quality data. So quality is much more important than quantity. And you want to have question and answer pairs for that data. So you pass in natural language, you define a natural language question, the schema of the database and then the expected result, and then how you fine-tune depend on which model it is. If it's through OpenAI, you upload a JSONL file with all these question and answer pairs on their website and then you do a GTP request to start fine-tuning, then it automatically does that. And it passes it into them and you can see the, what is it, the loss curve or something in their UI. Whereas if you want to fine-tune an open source model like Code Llama or Llama 2 or something, you would do that through PyTorch or something like that.

Andreas Kollegger: Okay. So do you get a sense of, how do you know that you've done enough I suppose? Because in the pure natural language aspect of this, there's all these notions that it's at a certain level of scale that new patterns emerge, the abilities are emergent behaviors that come out of lots of data. To your point about the quality of data with these question and answer pairs, do you see a cutoff point where okay, 10 does nothing but 100 or a... Is there some way to know when you start to get past a threshold like it's starting to understand or improve?

Oskar Hane: Yeah, that's a good question. So you also set up on the side, you set up a test dataset, which is not included in the trained dataset. And then you run those tests and save the results of that so we can see how the fine-tuned model does and compare between the models then. So we did a few different setups where we ran all the training data once or twice, or I think seven times at the most. It's called-

Andreas Kollegger: It's the same data. You're just-

Oskar Hane: Yeah, same data. It's called epochs. You cannot override, you change the weights in the network by doing it multiple times to evaluate, or can we find what's the ideal number of epochs to use and things like that. But we didn't go too deep into the size of the training data, we always used the same size in this project because of the time restrictions, but we did some of these things.

Andreas Kollegger: Because we spend a lot of time within Neo4j, just on the data side of course. So here you're actually bridging over into the AI engineering side, the [inaudible 00:14:13] I said those, this may be the right phrase to say. And so you think that this is somewhere that's been promising enough that will probably continue down this path, you know this is going to get something useful just out of it?

Oskar Hane: Oh yeah, for sure. I mean, we want to expose these to our users in our graphical tools so we can help them author Cypher queries. Maybe not automatically in the background, but as a starting point for them to iterate on the query for the complicated ones. So yeah, for sure.

Andreas Kollegger: So when we've talked to customers about GenAI, what they're interested in, because sometimes we talk about these, the three buckets of different datasets you've got, and this is over on the if you just have structured that you already have a knowledge graph or just a graph, you do something like this natural language to Cypher generation and that's what that interface would look like way on the other side. And actually I should have changed at least on my screen where I'm gesturing because way on the other side is where Tomaz has been spending a lot of time like, okay, if you have unstructured data or just pure text, there's an entire other pattern that emerges there, right? And I know this has been well-trained, but Tomaz, you want to lead us through what you've been doing with RAG and-

Tomaž Bratanič: Yeah. So first, let me give a disclaimer. Retrieval Augmented Generation just means that you're passing context to the LLM at query time. So that doesn't mean it's limited to vector index, it can be anything. It can be because the first, even if you look at LangChain, what they did at the very beginning, they plugged in Google Search, right? So you have an LLM that's able to use Google Search. And then I don't know exactly what they do, but basically you look at the first three webpages, scrap the HTML and feed that to the LLM and then hopefully the result is there, right? So the Retrieval Augmented Generation is not retrieval strategy-specific, it's anything, right? It could be your model whispering into the LLMs, so yeah.

Jennifer Reif: So it's actually more about Retrieval Augmented Generation being you're actually just getting references from literally anywhere outside the LLM and using that to feed the LLM?

Tomaž Bratanič: Exactly.

Jennifer Reif: Okay.

Andreas Kollegger: So you got a random dad joke generator that just tick... Whatever the user asked for, you x]can add a bunch of dad jokes to it and then hand that to the LLM, it's still RAG.

Tomaž Bratanič: Exactly. And you say, "Please do not repeat, but make something new", right? So [inaudible 00:17:13].

Andreas Kollegger: Does prompt engineering happens just right there, yeah?

Tomaž Bratanič: Yes. So basically this is an important. And now why everybody thinks that RAG is limited to vector search or vector similarity search. That's why, because it's the easiest thing to implement because in the beginning what we saw a lot was ChatPDF applications, right? So what you do, you take the PDF, you extract text from the PDF, you need to chunk the text a little bit, right? Because most LLMs at the time had a limited token space. And even now that we have 100,000 token space, it still wants to chunk most of the time because putting in big context or long prompts to the LLM, first of all it costs money, every token costs money. And then it also increases the latency, but also the more noise you have in the prompt, the less accurate then. So even with this long LLM token spaces, you still want to chunk text.

So let's say you chunk by each page or by each paragraph, whatever is your strategy, and then you use this magical model called an embedding model. So the embedding model takes text natural language and produces basically a list of numbers that describe that text. So it's a magical thing that happens where you take text and now you have a bunch of numbers describing that text. So what you then do is just store those numbers presenting text in the index, so you index them. And then at creating time, you take the user input, you embed it and then compare the embedding of the user inputs to the embeddings of the documents. And most often you use cosine similarities. So basically it's high school trigonometry, like [inaudible 00:19:48] elementary math, and you just say cosine similarity is your method like how you compare the documents to the user input. And then you simply return the top and similar documents and that is your retrieval strategy.

And now why everybody basically that was what everybody started with is because there's a very low barrier to entity, right? You don't have to do anything basically. You can have a default chunking strategy, there's no pre-processing step, there's nothing. No manual labor, it just works right every time. Now obviously there are problems, but when you have small PDFs, it works relatively well, right? So the reason why I feel like the vector RAG was the first and the most mainstream is because of the low barrier to entity. Basically anybody can do it, you just align two or maybe five lines of code and it works right.

Andreas Kollegger: Okay. I mean, one of the things that occurred to me with this similarity search, and I think it is a scaling problem as you're mentioning, is it's fine if you've got 10 documents and they're all different and that's fine. If you've got one of these contacts where you've got a million RFPs and in the RFPs, the common text is going to be 60% of the RFP, right? And so that doesn't help you narrow it down the search space, right? When most of the content is actually similar, the difference isn't that much. So depending on what the user asked, it's all up to the user to ask specific enough stuff to get the stuff that's relevant out of the million documents, right? And if they know specific enough stuff, then they can probably just get the document. At some point, you're not searching, you're just saying, "Hey, get me that document because I know the document that I'm looking for".

Tomaž Bratanič: Yeah, and I mean, it's also this little detail that not a lot of people were talking about, but when we did the NaLLM Project a couple months ago-

Andreas Kollegger: What was that?

Tomaž Bratanič: ... so the NaLLM Project, maybe Oskar can tell about the NaLLM Project.

Andreas Kollegger: Do you want a quick sidebar [inaudible 00:22:19], Oskar?

Oskar Hane: Yeah. Yeah, sure. Yeah, that was the first project we started I think in May because we saw a lot of interest from customers in this space. So we wanted to start a Azure project to reach out to them to collect use cases and their general interest and in the meantime build something in code on the side to see what's the limit of the LLMs today, what can we do today with knowledge graphs and LLMs. Not sure where Tomaz is going with this, but that's the background.

Tomaž Bratanič: So now, let me now continue. Okay, so during the LLM, we also played around with vectors, right? And what I noticed is that I had a specific query, so what's the latest news, right? And then a second query, what's the latest articles? And you would think that those two queries would return very similar because our dataset was basically a bunch of news, right? So the two queries, what are the latest news and what are the latest articles, returns completely different results, right? And it's just a single-word, news or articles, right?

Andreas Kollegger: Yeah.

Tomaž Bratanič: And who's going to debug this? Who's going to optimize these three things? How are you going to educate your users? Right? News don't use articles or use articles when you're searching for this, but also it's like there's a lot of tiny details that I haven't mentioned for vector similarity search.

Andreas Kollegger: It reminds me one of the many memes around LLMs and GenAI is that GenAI is a sock puppet, right? That it's really just, it is you talking to yourself and okay, maybe he knows more than you do, but it is so dependent on it's you operating the thing. Whoever that hand is is going to change what the responses are and it's fascinating. Does your intuition, Tomaz, say that for that nuanced difference between answers for articles versus news, is that a problem with the embedding? What do you think is root cause there? How does that happen?

Tomaž Bratanič: Maybe it's not even a problem, it's just how things are because maybe news and articles are not the same thing, but we know specific domain or video specific dataset they are. But in the real world, maybe news articles and articles are not the same thing. Maybe articles are more like research papers and news are more like like news. So it might not be a problem per se, but also things hidden in the embedding model that you first of all have no influence over. Now we see that there are emerging strategies for fine-tuning embedding models, but basically it's fairly opaque also because you basically don't know because it's like a magic box. You put in text, you get some numbers out and just hope for the best, right? Basically that's AI.

Andreas Kollegger: Hope for the best. Being prophetic now you're looking at the future, like reach but you can only hope for the best.

Tomaž Bratanič: Yeah. And if something doesn't work, you don't know what to do, right? Because even with LLMs, there's only so much thing you can fix with prompt engineering. Okay, so it's something that you can give additional instructions, but then at some point it just like the ADHD takes over and it just lose its focus and then you can't do much about it, right? When you have a demo, you just hope that today it's feeling good.

Jennifer Reif: I think the soft skill of expression and communication is just not something we've mastered yet. Humans are pretty good at it, I mean, even sometimes we fail at expressing thoughts eloquently or relatably to other people, but I think that's one thing that computers still lack is combining that soft skill and really mastering it yet, which I mean understandably so.

Tomaž Bratanič: Yeah, but now that you're talking about soft skills, there was a paper where basically if you put emotional, how do you call it, I call it basically gaslighting, but emotional pressure on LLM, it performs better. So if you tell that the GPT basically my job depends on your answer, it's going to perform slightly better.

Jennifer Reif: Interesting.

Tomaž Bratanič: Yeah. The Google one was basically for the Google LLM, I think the prompt was take a deep breath and think about it. And it improves the results, actually. So basically LLMs do understand and basically, there is a need for like emotional gaslighting of LLMs.

Andreas Kollegger: That is fascinating.

Oskar Hane: Early on when the term prompt engineering appeared, I was laughing and thinking, I mean, that's not a thing. It's ridiculous to call something like that engineering, but the more I learn about it, the more true it is because you have to learn what the LLM wants to hear to get what you want basically.

Andreas Kollegger: Do you think, and this is trying to guess the future, right? Okay, so prompt engineering makes sense right now, it's probably something we'll need to continue to work with. But thinking about your comment, Jennifer, about the soft skills that we have as humans, we're pretty sloppy and inexact in our communications, except we take so much more than just language when we're communicating with each other, right? When its voice says as it is now amongst the four of us, it's our pacing, the pauses that happen, the emphasis on words, the emote, the ups and downs of the tone. And at least for us, we can see each other and we're taking visual cues as well. All of that gets fed into the context of like, is Tomaz thinking about what I'm saying right now? Is he just bored and looking to get at? I can pick up that without him saying anything, right?

And the LLM has none of those clues. It does not get any of that extra context from us that you do when you have conversation. So if you're casual in a conversation with the LLM, it's not enough. And so I guess that's maybe what you're getting at with Tomaz with the gaslighting, as you said. It's like okay, you have to tell it, "No, no, no, take this seriously. You can't hear my tone of voice, but I'm going to tell you this is very important, so don't mess around". And it changes the neuro pathways that get triggered and however it flows, right?

Tomaž Bratanič: Yeah. And by the way, now that you're talking about visual cues, I don't know if you watched OpenAI other day, but now we have VisionLLMs as well. So now, I mean, it's just a matter of time. Now we are at the state of images, but sooner rather than later we'll also have video inputs. So you'll also be able to have an LLM basically look at you and basically hear you and what you're saying and how you think. So those cues are coming, definitely.

Oskar Hane: Yeah, I was going to say that that right now they're pretty good at describing a picture actually, like what's in the picture, the mood of the picture, the expression of if there's a person in the picture. So if you feed that to the other-

Andreas Kollegger: Can it get some sentiment analysis of pictures, like these people are happy?

Oskar Hane: Yeah, yeah.

Andreas Kollegger: All right. So now we're getting a little bit into particular in the future so I'm going to keep you guys on this actually. Are we solving problems that are just today problems that next year, given the pace of improvements, could you see that some of the things we're trying to solve just simply aren't going to be problems next year?

Tomaž Bratanič: Yeah, I mean, a lot of prompt engineering will hopefully go away because basically the prompt engineering how is it we're just making up for the lack of LLM understanding so the better LLMs will get the less prompt engineering we need because what you can also see is that if you take for example GPT-3.5 and GPT-4, with GPT-4, you need a lot less prompt engineering and it have to be less precise, less tokens because it's just better at understanding. So that's definitely something. Basically LLMs would be better and better at getting those cues.

Andreas Kollegger: Do you agree, Oskar? Temporary problem?

Oskar Hane: Yes, I think that part I agree with. I think so. The base trained LLMs they have today are useless, so they put layers on top with alignment and all that kind of stuff, which in turn turns into GPT-4 or ChatGPT even. So they already have layers on top, and then prompt engineering is just another layer that's more specific to the use case. So I don't know how that would be expressed in the future, but I think the way we send it now is a bit like a system message to the LLM how to behave. I'm not sure if that's going to survive, but possibly. But the other part on how can we give the LLM knowledge of non-public data for example, or RAG stuff, I think that's a much, much harder and I think it is quite an interesting question on how can we feed data to LLMs in the future or GenAI models in general. I think RAG is a really good solution for now, but yeah, I think we'll probably see some developments there on combining at a lower level than prompts, on knowledge and LLMs possibly.

Andreas Kollegger: Oh, interesting. So like a Neuralink, like Elon Musk's Neuralink, but instead of for humans, it's for LLMs so we can tap straight into the brain as you're saying?

Oskar Hane: Who knows? Yeah.

Andreas Kollegger: So the other fun stuff that's going on around this, actually, I'm going to ask you guys about this because in the background of all of this, there's okay, next generation of the LLMs, the foundation models, those who can continue to improve. There's been some rumors about Q* when there was the kerfuffle at OpenAI with Sam Altman and the board and all that, and that there was news about, okay, people are maybe concerned that the next gen stuff they were about to do was going to be danger to society level of powerful.

With all of this, there's a background thread or discussion of artificial general intelligence and it's just around the corner, and there's all of that sense in the general public of like, okay, there's a lot happening here that we don't understand and it's super advanced, it's going too fast. At the same time, the comments in this past hour are about like, "Oh, it's a toddler with ADHD", those are very different outcomes, right? There's the super intelligence and then there's a toddler with ADHD. If you had to guess in the next year, next two, how far off could you imagine things are going to go?

Tomaž Bratanič: Yeah, I think they're going to go really far basically because what we see is just how much things have happened in one year. We went from basically the GPT-3, which it was actually a simple [inaudible 00:35:30] complete to now basically ChatGPT with all basically conversation and reasoning and then agents and all of that. So I would say that it's basically the exponential curve, right? It's not going linear, it's going like this so it goes really fast. So basically, I don't know. Maybe we don't even have the words to describe it.

Oskar Hane: Yeah, I think I'm a bit [inaudible 00:36:14] on it because the more you learn about it, the less magic it is. But at the same time, the progress as Tomaz described is also really, really impressive. I read both sides, the sides that think we have a scary future and the other side that don't think that. I think Yann LeCun, the chief scientist of Meta who creates Llama 2, he don't see a doomsday future of LLMs. He sees the LLMs like a white box rather than a black box that we can align quite easily, more easily than we can do with animals, for example. So yeah, I hope he's right.

Andreas Kollegger: What does that mean as a term like white box, I guess maybe we can guess at that, but what does that mean, a white box rather [inaudible 00:37:16]?

Oskar Hane: That we can look into it I assume because a black box, we don't know that that is, right? Yeah.

Andreas Kollegger: Okay.

Tomaž Bratanič: It's maybe transparent or something.

Andreas Kollegger: Okay.

Oskar Hane: Yeah. Yeah, he express it white, but yes.

Andreas Kollegger: Cool. All right. I know they're wording late into the hour, but I love this discussion. Jennifer, do you have any other thoughts you want to bring into this or should we-

Jennifer Reif: No, I think we've covered I think some really good ground that it's RAG and LLMs and so on are not a magical silver bullet like every other technology that's come out before, but there's some really amazing things that they can do. There's some gaps that they are bridging and some dots they are connecting that we've never had access to before, and I think I'm really excited to see what next year brings as far as the really amazing societal and technological things that we can do with this to hopefully improve life and the world around us.

Andreas Kollegger: Yeah. I guess it's like any new technology, new power, it can be used for bad, it can be used for good. It's not really the technologies itself that's going to make the difference, it's the people who are using it and that's where the real grave dangers or benefits are.

Jennifer Reif: Yeah.

Andreas Kollegger: And there's more good people than bad people in the world, but mostly-

Jennifer Reif: Hopefully.

Andreas Kollegger: ... it's safe, right? Yeah, hopefully. Of course. Of course, there are.

Oskar Hane: I just want to add one thing, and I think we are super early in this development, right? The RAG pipelines we see today, we don't know really good way to evaluate the quality of the output of a RAG pipeline. There are a few frameworks, but we're so so early. So I think I'm looking forward to having more tools to evaluate the quality of the output. So if I make a little change in my chunking early on in the pipeline, how does that affect the quality of the outputs all the way at the end? It's like yeah, we don't have that today. So I'm looking forward to those kind of tools developing in the next year.

Jennifer Reif: Yeah.

Tomaž Bratanič: Yeah. And one thing maybe that's mentioned is what everybody agrees is that chat is not the best user interface for LLMs, but nobody knows what is the better alternative. But we all know that it's not the best thing because the chat interface basically, you still need to have a lot of knowledge about if you're using a RAG, you need to have some knowledge what's in the database, how to basically do follow-up questions, basically how to go in-depth, right? There's no one, no handholding, nobody handholding you and pushing you along the steps. So I feel like we'll see those handholding guiding hands in UX sooner or later, but I hope that that's another thing that's going to improve.

Jennifer Reif: Yeah, the value of your output's going to depend on the value of the input and how you help provide those better inputs is [inaudible 00:40:37].

Tomaž Bratanič: Yeah, yeah, exactly. It's just like you're, basically I was going to say modeled, but basically you're asking that question, but it's, "Oh, did you actually mean that?" Right? And yeah, I have an answer for that. No worries, I understand what you're talking about, right?

Oskar Hane: There's a dialogue [inaudible 00:40:55], right.

Tomaž Bratanič: Yeah.

Andreas Kollegger: So that's great. There's basically a design challenge here that we don't know the right path forward for it, but that's interesting. Okay. Do you guys want to stay around a little bit longer? Jen, we can maybe go through our tools of the month.

Jennifer Reif: Yeah.

Tomaž Bratanič: Yeah. I mean, I have to drop off, I'm sorry, but-

Andreas Kollegger: Because he's got to write another blog post about-

Tomaž Bratanič: I got to join another meeting, I'm sorry.

Jennifer Reif: We'll see how much more content Tomaz can produce before the end of the year, right?

Tomaž Bratanič: Yeah, I mean, I maybe con Andreas. He owes me a blog content.

Andreas Kollegger: I do. Post writing will happen as soon as I finish this.

Tomaž Bratanič: Okay.

Jennifer Reif: Well, we definitely appreciate your input, Tomaz. And thank you so much for joining us.

Tomaž Bratanič: Yeah, happy to join us you as well.

Andreas Kollegger: Cool.

Tomaž Bratanič: Okay. So see you soon.

Andreas Kollegger: All right, Tomaz. Take care.

Tomaž Bratanič: Bye.

Jennifer Reif: Cheers. Okay, so we want to dive right in with our tools of the month.

Andreas Kollegger: Yeah. Shall we? We've got a little while left in the hour.

Jennifer Reif: Yes.

Andreas Kollegger: Okay.

Jennifer Reif: We'll scoot through our community stuff here very at the end, Andreas. But for my tool of the month, we like to highlight tools we've been enjoying, our favorites of ours or anything we've been really using the last month or so. And one that I have not used yet but I'm really excited about is there are some APOC procedures that have been added recently, I just found out about this yesterday, that are embedding procedures. So you can generate embeddings, you can create Cypher statements, you can query Neo4j and just have it describe the model to you and a few other things. So I'll put a link in the show notes for that, but I found as I'm digging into the LLM stuff more and more and RAG, a complicated thing is well, I have this data but I don't have embeddings for it.

And so this would be another way. If you're already in Cypher or if you're already dealing with Neo4j interactions, you can use an APOC procedure to help you generate those embeddings and of course several other procedures that'll do LLM embedding type stuff for you too that are in that list. So definitely feel free to check that out. I will definitely be playing with that and seeing what all it can do, but really exciting stuff there.

Andreas Kollegger: Nice. APOC is I think a fountain of giving. It just keeps flowing forth. I know, I'll do that metaphor, but that's awesome. I have a really bad LLM and I have a very small context window, and I might've mentioned this tool before. I tend to repeat things, I usually talk about arrows, but now I might be repeating myself on this tool called Warp, which it's just a terminal application. It's a modern terminal application written in Rust I think, and I want to bring it up because it is just a terminal, okay? Right? So it's just fire a bash and do some command line stuff and it's not that exciting, except I'm so excited about it that I want to find reasons to use it because it's so cool. All the extra things that they've added to it, all the different, it has a bit of AI built in so it helps you actually understand what you're trying to do with commands.

And it is one of those tools that's just, I don't know, it just makes me happy. So even though I don't need to spend a lot of time on the command line when I'm like, "Oh yeah, I've actually got to do a bunch of grepping through some directories". I'm like, "Oh cool, I'm going to pop open Warp and I can get to it because it makes me happy". It's that level of joy even on a command line terminal tool is a good reminder for any of us who build software. It's possible to take a terminal and make it fun and make it enjoyable, right? If that's possible, whatever you're building, it should be possible to make it actually fun for people so that they want to use it, so that they find reasons to use it. So I've got two reasons for loving this because I just love it in and of itself and then I love that I love it, if that makes sense, right? Oskar, do you have a tool that you'd like to mention here?

Oskar Hane: Yeah. Yeah, I do. I find Llama, which is a tool from Llama.ai. It's a LLM management tool I would think, it's like docker for language models in a way. So they have a registry of models that you can download and you download model and run it locally. It spins up an HTTP API in front of the model so you can build applications and add AI capabilities to it without leaving your computer. And I find that really, really, really cool and it works really well. I also saw that they successfully deploys to-

Speaker 5: Success, but to the [inaudible 00:46:02].

Oskar Hane: Sorry, I triggered my speaker over there. What was I going to say? Yeah, I see that they have deployed to production environments as well, so you can develop locally and publish stuff, open source models like Llama 2, Code Llama or everything. Yeah, [inaudible 00:46:21] really useful and super cool to run it locally.

Andreas Kollegger: Awesome. And that was part of the GenAI Stack that you guys worked on at DockerCon, right?

Oskar Hane: Exactly. Yeah. So I still use it almost every day. You can pop into a terminal, you don't have to use the HTTP APIs. You can just enter a ripple and start chatting to it, yeah.

Jennifer Reif: Oh, that's really cool.

Andreas Kollegger: I wonder if there's Warp integration with all Llama [inaudible 00:46:50]. Cool, thanks for that. Oh, and this is for people who have been waiting to hear about it. Jen, is there any Neo4j news that we should be bringing up?

Jennifer Reif: Yes, new Neo4j and Graph News that's out there. We actually have two brand new Graph Academy courses for LLM fundamentals. This one just walks through the introduction and definitions and terminology and building a, I won't say simple or basic, but a starter app if you will to work with LLMs. If you're new to LLMs or even just the GenAI space in general, no matter your language of background, I think there's a lot to be pulled from from that course. The other one, the course that's available is a chatbot with Python and Neo4j. I'm actually in the middle of this course, I have not completed it yet, but really good information there.

It's a little bit more, I won't say maybe complex I guess, but it takes the next step and the next level of the fundamentals course where you're building a chatbot and more agents and things working with Python and Neo4j. And then the Import CSV course actually got an update, so it's been out there for a while. If you've taken it before, feel free to go back and refresh. Or if you're new to it, definitely check that course out on import.

Andreas Kollegger: I'll mention some product updates. I guess the big news, and for those of you who are paying attention, there's of course OpenAI. We shouldn't forget there's other people out there doing stuff in this space. There's still Amazon has some a cloud service, they call it Amazon AWS, what is that? Web services or something, whatever that stands for? Right?

Jennifer Reif: Something like that, maybe.

Andreas Kollegger: AWS is probably not quite just web stuff anymore. And of course, they've got their own stack of things and services around language models and building out RAG services. And there's this fantastic new partnership between Neo4j and AWS that's been announced as part of this week. We're just going through AWS re:Invent while we're recording this here today. And so AWS Bedrock, using that in integration with Neo4j has just been announced this week, so we're pretty excited about that. There's a whole slew of things that are possible, clicking some buttons to get through some panels and things, but you can get stuff going up on AWS in the traditional way there. So that's pretty exciting. We're pretty happy about that.

Jennifer Reif: Looking at articles and other content that's starting to be released just over the last few months. So I feel like we're getting to the end of the year, things are not moving quite as quickly maybe, they're ramping down a little bit before the end of 2023. But actually there's some really amazing pieces of content that I've looked through and seen pop up over the course of this month. The first one is RDFLib-Neo4j, a new era in RDF integration for Neo4j. So it's a blog post out on the Neo4j developer blog, but it's an alternative or another option for importing RDF data. So if you've dealt with this area before, Neosemantics has been the go-to, but it only works with local instances where this new RDFLib is a Python library and you can use it with cloud-hosted databases now as well.

Py2neo, for those of you who have maybe been around the space, Neo4j for a while has now come to end of life. So the Python Library, Py2neo, a way for Python developers to interact with Neo4j, like a driver if you will, but just building a little bit more integrated with the Python mindset and some features and functionalities that Python developers are used to or that are really nice to have. So it became end of life this year and so there was a blog post written for helping you migrate off of Py2neo onto other solutions. So the recommendation is to either go to the Python official driver or to something like Neo model, and the blog post will walk you through which one of those to use where, how to migrate off of it. But the examples are going to give you a migration path for Neo model. Enforcing data quality in Neo4j 5, new property type constraints and functions.

I've had lots of user questions over the years asking, "Well, how can I ensure data types for the data that's getting pushed into Neo4j?" And because we're schema free or schema optional if you will, or schema flexible, however you want to use that, term it, this hasn't previously been possible, but now there are new checking of data types functions that are in the new Cypher. And you can also add new constraints to restrict the data types for properties. So it'll mention a few. The blog post walks you through all of this, gives you example code and shows you where to go for more information. But it also talks about a few edge cases where the constraints and the functions may not work like you expect them to work. So a few gotchas and pitfalls to be aware of as you're using these new features.

The next one, Tomaz had actually mentioned I believe, looking at taking older solutions and modifying them and using them with LLMs now this year. And so this was a blog post analyzing annual reports using LLMs and Graph Technologies. And the author was looking at a solution they had built, a Neo4j colleague had taken this solution they built back in 2019. They'd used a lot of NLP technologies and such, but they were saying, "How can we update that and maybe improve it using the new technology of LLMs?" So what does that update solution look like? So instead of leaning heavily on those NLP APIs, the LLM can do some of or maybe a lot of that work. And so he walks through that, which I think is really cool that not only are we looking at what can you do now and in the future with LLMs, how can we improve existing solutions or older solutions and maybe make them more possible or better than what they were at a previous point in time.

So really cool article there. The Needle StarterKit, I tried saying that three times fast, the ultimate tool for accelerating your graph app projects. There's an article out there for that. The Needle StarterKit is out of the box components for developers to use when you're developing applications and interact with Neo4j. So this just gets you up and running, building applications with Neo4j. I almost think of it like a spring timely starter. It gives you the base outline for the app and then you can customize and fill in gaps and build some cool things from there, so. The last article that I came across was by Nathan, Clustering Graph Data with K-Medoids, which sounds way complex for someone who's not super into algorithms or is not super steeped in algorithms.

But the article goes through and really explains what K-Medoids is. It's an algorithm that's similar to K-means, it's an approach for finding clusters in your data. It walks through how to apply that to graph data and then visualizes and charts the results of a couple of different datasets using Amtrak train routes throughout the US and then also genetics. So a couple of different opportunities to look at it there. And of course, visualizations always make that picture so much more meaningful to me at least. So just a really good article and nice walkthrough there.

Andreas Kollegger: Cool. A lot of good stuff, Jen. A lot to read through.

Jennifer Reif: Yes.

Andreas Kollegger: I know that we've had videos. Of course, there's always [inaudible 00:54:29] videos that are available there if you happen to have missed some of NODES 2023 because there was only 300 hours worth of video if you rewatched all the tracks across all the different time zones. We're still processing all that video and chunking it up and getting it in segments. So if you go out to our NODES 2023 playlist, new videos have been added. So if you're looking for something that wasn't there previously, take a look again, they might've been added to the mix. Probably this video mentioned will get mentioned next time around as well in 2024, but we're still working on processing those videos. There's just a lot of them. And I haven't watched this one yet, but there's a video around humanitarian AI.

There's a meetup I guess around this using IATI data, which is an acronym that I'm not familiar with and I'd have to try to guess that, what that is. But humanitarian AI itself, that sounds like an interesting thing because back to where AI is powerful, you can use it for a force of good, you can use it for force of evil. This is clearly going to be, I'm going to assume a checkbox in the force of good realm, like how can we use AI for improving humanitarian outcomes, disaster recovery or whatever it might be. I think that sounds very promising. I love that.

So upcoming in December, if you follow along with the Going Meta series that we have, it's every time awesome. I don't know how [inaudible 00:55:57] just keep, the content is always fresh and it's always engaging and always just really insightful. So there's new one of those coming up in December 5th. There's a virtual event for Discover Neo4j Aura, The Future of Graph Database-as-a-Service, a workshop that's going to be happening in December 5th as well. Some training out in Atlanta, Georgia. Neo4j and Google Cloud Generative AI hands-on lab coming up for you guys out there, so that's going to be pretty great. And then continuing in December, so gosh, is it like every day in December? December going to be busy.

Jennifer Reif: For the first couple of weeks and then we start getting into the holidays and then it gets quiet.

Andreas Kollegger: Okay. So this is the end of the year, like the mad rush, like, "Oh my gosh, we've run out of time. Let's get some new events in".

Jennifer Reif: Yep. Yeah, a couple of conferences. One in London on December 6th. There's also on December 6th the API Days Paris. So both events on December 6th, both in London and Paris, so you won't be able to hit both of them. And then a YouTube series for Neo4j Live, that's a community-run event by a colleague, Alexander Ertel. And he's looking at powering advanced Streamlit chatbots with GenAI. He always brings on a guest and they talk about that topic and show their work there, so that'll be really interesting. The conference from London that was happening on December 6th is also on December 7th, so a repeat of that. There's a lunch and learn for tackling GenAI challenges with knowledge graphs, Graph Data Science, and LLMs also on December 7th. On December 12th, there's a meetup for Graph Database Melbourne. So if you're interested and in the area for that, check that out. And then Andreas, you want to take back over for the last couple?

Andreas Kollegger: Yeah, I'll get through the last couple ones. So if you're out in Chicago, there's a conference happening on December 13th, the Evanta CDAO Executive Summit. Gosh, that sounds like it's going to be important. So out in Chicago and December 13th, try to get to that if you can. There's two installments for two different time zones of a connections conference that's happening for Generative AI and knowledge graphs also on December 13th, we'll have links for that in the show notes. And then we'll close out with some more meetups happening. There's a meetup for Engineering Kiosk Alps Meetup in Innsbruck. So I'm going to assume this is out in Austria, which, oh my gosh, December 14th, a meetup in Austria. I think we should all fly into that, that would be amazing. Innsbruck, we'll do a little bit of skiing, we'll talk about grass, we'll talk about AI. Had I seen this earlier, this would've been the highlight of the entire show.

And how delightful, right? And also a nice location to go to, Tampa, Florida Meetup for AWS re:Invent, a recap meetup around that actually, which sounds almost crazy that you would have an event like that. But AWS re:Invent is an entire week of just nonstop stuff happening and the idea of having a meetup to just be like, "Okay, what did we just see?" And it actually sounds pretty appealing, I would love something like that. So if you happen to be in Tampa, look for that meetup. That sounds like a good event to get together with people and try to sift through all the announcements that happened around AWS.

Jennifer Reif: Yeah. And after December 14th, sounds like we're all off for the holidays.

Andreas Kollegger: Perfect. Cool. Oskar, thanks again for joining us today. You and Tomaz have been the center of the storm of Generative AI within Neo4j and remaining calm and upbeat through all of it, which has been amazing to watch, and a lot of good output from you guys. So thank you for all the work you've done, and thanks for joining us today to share your view on stuff.

Jennifer Reif: Yeah.

Oskar Hane: Yeah. Yeah, thanks a lot and thanks for having us.

Jennifer Reif: And to everyone else, we will see you in year 2024. Happy holidays.

Andreas Kollegger: Happy holidays.

Oskar Hane: Bye.