GraphStuff.FM: The Neo4j Graph Database Developer Podcast

2024 Recap: Favorite highlights!

Episode Summary

Join the Neo4j tech advocates for a recap of the entire 2024 year - our favorite highlights, tools, events, and more! From major milestones in the industry to exciting events to favorite tools, catch all the high points from each of us. Then, hear where to find us at the beginning of the new year. We wish everyone a wonderful close to 2024 and a happy new year!

Episode Notes

Topics:

Tools of the Month:

Articles: Get Started With the Neo4j Aura CLI Beta Release

Videos:

Events

Episode Transcription

Jennifer Reif: Welcome back, Graph enthusiasts, to GraphStuff.FM, a podcast all about Graph and Graph-related technologies. I am your host, Jennifer Reif, and joining us today are all of us tech advocates at Neo4j. ABK.

ABK: Hello.

Jennifer Reif: Alison Cossette.

Alison Cossette: Hello.

Jennifer Reif: And Jason Koo.

Jason Koo: Hello, everyone.

Jennifer Reif: Today we are going to be recapping 2024. We did one of these episodes last year for 2023, and so we thought we would close out 2024 with the same idea by recapping everything that has happened this year, which actually turned out to be a huge task for all of us to come up with what actually has happened this year. It's been a long and short year, we feel like. Does anybody have any thoughts they want to start with on that?

Jason Koo: I feel like we should start with GenAI and GraphRAG because that has been, I think, the big shift and the big driver over the year. GenAI and Retrieval Augmented Generation, or RAG, came up really strong at the end of last year, and we came into this year running. We immediately started working on example use cases for using RAG systems with Graphs and really helping our own community get familiar with the space, because we ourselves predicted vectors are amazing, they're great, but they're going to hit a cliff, or they're going to hit a cap, and the only thing that makes sense after that is to leverage knowledge Graphs. So how do we jumpstart or get people familiar with this inevitability, right?

Jennifer Reif: As we chatted about the last episode, yeah.

ABK: I think you're totally right, Jason, and I think one of the interesting challenges of this year, and I'll contrast it to last year, where everyone had some idea of how GenAI seems like it's a thing, but no one was really doing anything yet. But between folks like us who were trying to figure out what to do with it, and then also businesses who were trying to figure out what to do with it, everyone had on their agenda for this year: do something about GenAI. And for businesses, maybe it was, say, let's put aside some budget for a proof of concept or something like that. And then obviously for us, we're like, step one, how to do RAG at all, and then very quickly, as soon as you take that first step, it's like, that's cool. Then you take another step and another step, and now it goes down an entire rabbit hole of possibilities.

And so I think it's fair to say this year was nevertheless still about exploring what is possible, as relentlessly, every week, every day maybe sometimes, but at least every week, there were new papers coming out, new solutions coming out, new LLM models coming out across the entire stack of things that are possible, from end-user, consumer-level stuff, I'd say, all the way down to the research papers. The entire thing was just relentlessly moving forward through the whole year. And so while we're simultaneously trying to just define and describe and help people learn how to do things with what's available now, the now was shifting sand the entire year.

Jennifer Reif: I really felt that this year as well, that things moved so quickly there were new things available, or new strengths and weaknesses, or people would start using it for this or that, and the ground would shift or the momentum would shift into some other thing, and something else would take the lead for a while, and then it'd shift again. And so things became really out of date really quickly, and just the amount of ground that the whole space has covered this year has been huge.

Alison Cossette: Agreed. There was a talk I was giving at the end of last year called Beyond Vectors, and the entire year was about beyond vectors, right? It was all about: nice starting point, but let's actually get into it. So it's just really interesting for me even to look at the talks that I was giving a year ago versus where we are now, where now you go to graphrag.com and you have everything about what's beyond the vector, everything about Graph and how you leverage it and where it fits into your space and how to understand it. And ABK, we're going to shout you out here for all the work you've done on this this year. For the community, in general, being able to have a resource like that at the end of the year, when we were barely talking about it this time last year, is just a testament to what everyone's saying about how quickly it's moving and how much it's grown and where it's even going to go.

Jennifer Reif: To follow up on that, too, I did notice I had one presentation that I gave a few different times this year, and every time I came back to it to prep it for the next event presentation, I modified quite a bit of it because things had changed so much, and I felt like the groundwork or the backdrop setup for people to understand what was going on and how to get to building something with RAG had changed so much. Having to reorient at least the first part of my presentation every time was a little bit unsettling to me, but also fun because it was never the same presentation twice.

Jason Koo: Drawing attention to the beginning of the year to now, one of the challenges that we encountered very early on in terms of building these initial GraphRAG prototypes was that getting data prepped and ingested for a Graph system was repeatedly a challenge. It was a real roadblock to, I think, getting other folks to start. And out of that shared knowledge and frustration, the LLM Graph Builder emerged. We just needed a quick way to get people's own personal data, rather than the curated datasets that we had done in the past, to quickly choose some URLs or PDFs, whatever your data source is, and quickly get to the point of, ah, this is what the end data of a GraphRAG system is going to look like. Even if initially you don't give the Graph Builder some constraints, an LLM is going to create way too many entities and relationships.

But if you don't know how to start, that becomes actually a great starting point. Oh, these are the types of entities and relationships that appear most in my dataset. I can run this thing again with this constraint and get a cleaner dataset. And looking at it in retrospect, that was basically a year-long adventure and learning curve of, oh yes, this is the easiest way to show someone how to get started without spending a huge amount of prep time.
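The constraint loop Jason describes can be sketched as a toy filter. Everything here is made up for illustration, the triples, the schema, and the function names; this is not the LLM Graph Builder's actual API, just the idea of keeping only the entities and relationships that fit an allowed schema:

```python
# Toy sketch of schema-constrained extraction. The triples stand in for what an
# LLM might propose from a document on a first, unconstrained pass.

ALLOWED_ENTITIES = {"Person", "Company", "Product"}
ALLOWED_RELATIONSHIPS = {"WORKS_AT", "MAKES"}

def constrain(triples):
    """Keep only triples whose node labels and relationship type fit the schema."""
    return [
        (s, s_type, rel, o, o_type)
        for (s, s_type, rel, o, o_type) in triples
        if s_type in ALLOWED_ENTITIES
        and o_type in ALLOWED_ENTITIES
        and rel in ALLOWED_RELATIONSHIPS
    ]

proposed = [
    ("Alice", "Person", "WORKS_AT", "Acme", "Company"),
    ("Acme", "Company", "MAKES", "Widget", "Product"),
    ("Widget", "Product", "MENTIONED_IN", "Q3 report", "Document"),  # off-schema
]
print(constrain(proposed))  # the off-schema triple is dropped
```

Running once without constraints surfaces which types dominate your data; a second run with a schema like this yields the cleaner dataset described above.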

Jennifer Reif: It's funny, looking back-

ABK: And it's-

Jennifer Reif: Oh, sorry. It's funny looking back at all the things that maybe seem funny now taking an entire year to get the hang of or to catch up on. Sorry, ABK, go ahead.

ABK: No, I agree with you. I think actually I was going to reflect on that point a little bit, that from a year ago until now, the things that we now assume everyone already knows were hard-earned. They were new ideas that we had to figure out and test out, and having gotten past them, now to us they seem simple and obvious. They were not simple and obvious a year ago, but now it's, oh, obviously this is what you should do. You should chunk up your data in this way, you should do linked lists, you should do the summarization, all the things that we do that are baked into the tool, as you say, Jason. The LLM Graph Builder tool did a great job of condensing all that into one operation, but without that, it's, I hear what you're saying, but it seems like a lot of work, and I don't really understand what the point of all that work is.

We know, and I'll say even past that, I'll reference about a month ago we had this event we called the GenAI Graph Gathering, where we got together cross organizations between Neo4j and folks at Google, folks at Meta, folks at all kinds of places that are at the leading edge of doing stuff in RAG and Graphs and GenAI. And we got about 30 people in a room together, spent the day together. Basically it's a bunch of engineers in a room together sharing notes, because we're all trying to push the envelope. We're like, we're going to go off and build products and all that stuff is going to happen, but there's a baseline we all want to be able to just accelerate and agree upon together that any collaboration we can do on things that are common are going to help us all then in the differentiated things we're going to do.

And one of the comments I ended up repeating on that day was, "Let's all just keep in mind, everybody here in this room is living in Tomorrowland. Outside of this room, nobody has any idea what the hell we're talking about." That we are so deep into this that the things we assume that we have learned, hard-earned learning over the last year, it takes time for people to get up to that, particularly for some people, and I'm one of those people actually. A year ago, I would not say that I programmed in Python. I will say it now, but not in a very strong way. I can code in Python. I don't think I'm a Pythonista, but there's a lot of people who are like that. When we did our developer survey, that showed up, that lots of people learned Python so they could do GenAI. So collectively, we ourselves, the other folks that we work with, our customers, and then certainly I think everyone in the tech industry, this was just a giant year of learning, I think is probably one of the takeaways for me.

Alison Cossette: The other thing that was exciting about this year is I don't know that in my lifetime there's ever been a moment, at least for me, where we're part of actively creating the standards for how people are going to do things in the future. Certainly there have been moments, the cloud moment and the big data moment and these other types of moments where I talked to people in industry who were in those rooms at that time. And it's just exciting for me to have us be in one of those moments right now where, like you said, at the Graph Gathering or what's coming out now about retrieval patterns, being at the moment of definition, it's super exciting.

Jason Koo: Speaking of standards, the Graph Query Language, GQL, came out earlier this year, and so I was trying to look up, I forget which spec includes an extension to SQL to allow it to better query Graphs. I can't remember the name of the extension, but when you look at it, it's interesting because it's basically a mix-

ABK: SQL/PGQ.

Jason Koo: SQL/PGQ? Okay. It's basically a mix of SQL and what looks like Cypher, so you're using Cypher-esque syntax to do the pattern matching that you would normally do in a Graph data store. So yes, anyway, thought I'd riff off of standards.

Jennifer Reif: To talk a little bit more about the GQL release, I guess that went live in April, so it is officially out, and there's some work being done to migrate and incorporate some of the things that came out of that into various query languages, Cypher being included in that. So for anyone interested, check out the links and sources for that.

Jason Koo: And definitely be mindful that it's going to take the industry some time for everyone to be on the same page. So right now, I don't believe you can query Neo4j in both GQL and Cypher. It's still going to be Cypher, and Cypher is still being updated to also include GQL features. And I believe down the road, at some point, either we or the community will come out with a driver so that you could query Neo4j using GQL, but this is probably going to be the same for everyone in this part of the industry. It's going to take a bit of time before you get to that point.

ABK: I think in the interim, as you're saying, there'll be a bit of a mix across anybody who's supporting the language, where there'll be different levels of conformance, like how much of the spec you actually support. Everyone is going to have the basics; there'll be pattern matching. We can probably check that box off. If you don't have pattern matching, I don't know what you're doing; you're not doing Graph stuff. So there's at least going to be that. But then there'll be subtle things, like keywords that are basically aliases for what Cypher today uses, that are functionally the same in GQL, but they just picked different words. So you'll be able to use, I think, either word. If you're using Cypher, for instance, you can use either one and it'll be just fine. And as I understand it, part of the plan for Neo4j is that the Cypher interpreter itself will eventually be able to have a strict mode.

So if you only want to use the GQL variant of what's possible, you would use only that. Otherwise, you can just use whatever works. And probably that's going to be the case everywhere for a while. Hopefully it won't end up being quite as much of a mess, I apologize, but a mess, as SQL is; hopefully it'll be a little bit more common. The common set of what's possible in the language will be true across all vendors, rather than a very small subset and lots of differentiation across vendors, which is the case a little bit in SQL. Hopefully it'll be a little bit better in the Graph query language family, I guess I'll say for now, but we'll see how that plays out.
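The shared core the conversation keeps returning to, graph pattern matching, can be sketched as a toy in-memory matcher. This only illustrates what a Cypher/GQL pattern like `MATCH (a)-[:KNOWS]->(b)` asks for; it is not how Neo4j or any GQL engine is implemented, and the data is made up:

```python
# Toy in-memory sketch of graph pattern matching, the common core of Cypher and GQL.
# A pattern like  MATCH (a)-[:KNOWS]->(b)  finds every directed edge of that type.

edges = [
    ("Ann", "KNOWS", "Bob"),
    ("Bob", "KNOWS", "Cai"),
    ("Ann", "WORKS_WITH", "Cai"),
]

def match(rel_type):
    """Return (a, b) pairs connected by a relationship of the given type."""
    return [(a, b) for (a, rel, b) in edges if rel == rel_type]

print(match("KNOWS"))  # [('Ann', 'Bob'), ('Bob', 'Cai')]
```

Whatever keyword aliases a vendor picks, this semantics of "find every subgraph shaped like the pattern" is the conformance baseline ABK describes.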

Jennifer Reif: Well, we spent the last episode recapping the biggest Graph event of the year, NODES 2024. So that was a huge milestone, I guess, for Neo4j and highlight of the year, I would say, for 2024. I think that there was a lot of good content and some really exciting things going on. Just some of the intro sessions, the closeout sessions, we talked about the fireside chats and the fun and engagement as well as just informative chat session that was at the close of each region, so a lot of fun to be had for that. So the video playlist is out if you want to catch that. And then, of course, prep for next year. It'll be back next year, so definitely check that out. Does anybody have any final thoughts on NODES before we just point to the previous episode?

Jason Koo: No, other than that you should totally check it out. Lots of great insights. I know from 2023, even throughout the year 2024, I was still occasionally checking out different sessions from '23 because there was just such a wide variety of stuff and just so much good content. And we're all busy, but sometimes it'd be like, oh wait, "I'm going to a certain event" or "I'm going to be talking with someone," and "Oh, there was a NODES talk on that thing." I'll go back and it's perfect. There was actually someone who had done a session talk on exactly that subject matter.

Jennifer Reif: I do think it was interesting too, we mentioned this in the last episode, but how much the content had changed over the last year, too. You saw that reflected in the NODES presentations and even just the submissions that we got.

Jason Koo: Yeah. There was a whole track dedicated to the changes too, but even then it permeated to every track.

Jennifer Reif: Yeah, for sure.

Jason Koo: Definitely check out NODES, if you haven't already. Speaking of events throughout 2024, does anyone else have memorable events that they have been to? For me, Google Next really stood out for a couple of reasons. It was at Google Next that I went around with this mission of, you know what, I'm going to go meet all the other dev advocates working the floor and just say hi and figure out if there's any future partnership things we could do. And since then it's set off this chain of events of joint events that at least I've been involved with for the whole rest of the year. And also, just randomly, one of my childhood friends works at Google, and prior, I hadn't really thought to reach out to her. But during the conference, when people were just moving in and out, I accidentally ran into her. And it was like, "What are you doing here?" So it was great to catch up and just enjoy these serendipitous encounters at in-person events that really, I think, came back in strength over 2024.

Jennifer Reif: I will say that from personal experience, I did a lot more traveling this year, which part of that was personal life situations stabilized a little bit for me and allowed me to do that. And then part of it, I think, was just the amount of events coming back and really thriving this year was much, much greater. And there were a couple of smaller events I actually attended this year that were new to me, at least. I saw a lot of friends and colleagues from other events cross over into those, and it was just really interesting to connect with the different audiences and reconnect with a blend of other conference groups that were there. And so that was a fun new thing to do this year. And then, of course, hit all the major events that I always love and try to hit every year.

Alison Cossette: As my family will attest, I went to many events this year. They're actually happy I'm home right now. But what was interesting to me, to your point, Jen, is just the variety of environments that we were able to fall into. Early in the year, one of my favorites was Microsoft Fabric, which I thought was a really great event for us because it was one where people, information architects, similar to data scientists, would get Graph really easily. And so it was really interesting to be able to be in an environment that was so fertile for conversation about how it fits in really logically.

But also I was at everything this year, from DevAI to a Smart Environments conference in Montreal a couple of weeks ago. I can't even think of all the different things. MLOps and GenAI World was great. There were some really interesting conversations that came out of that about how Graph fits into your MLOps scenario. I just found myself in really unique and disparate environments that were new, and I really enjoyed being able to bring that Graph insight to these different types of communities to see where it was going to go. I feel like the scattering and the spread was a much wider area than it had been in the past. It was the breadth of the events, and not necessarily one in particular, that was really interesting for me this year.

Jennifer Reif: That is one thing that pops up that I noticed throughout 2024: Graphs previously were very much a data thing. They were a database. You talked about it in data circles and database circles and not really traditional enterprise events, but more of the straight uniform verticals type of stuff. And I've seen this year, with GenAI, Graph cross outside of those data boundaries and really become accessible in, like you said, Alison, all of these different types of verticals, industries, data, life sciences. And we've been in those spaces before, but it hasn't been a huge draw like I really felt it was this year.

Alison Cossette: I also found that I got invited to a lot more things than I necessarily had been before. Usually it's you send in your conference talk proposal and cross your fingers and, "Ooh, will they pick me? Do they like me?" And now it's like, "Hey, can you please come talk about this? Can you please come talk about that?" So I think that's part of the reason why my year was so busy, too, is there were a lot of invitations. That was newer for me as well.

ABK: I love what you're all saying, the GenAI aspects through all this stuff, obviously that's the theme of the year. And with that, two aspects. Every place I spoke, whether it's a meetup or a conference, whatever the context was, if I'm talking about GenAI, then it's going to be well received or I'm going to be invited so that I can talk about GenAI. I definitely experienced all that as well. And then I'll focus on one particular AI event, which was the AI Engineer World's Fair middle of the summer. I want to say maybe it was August, something like that. And this is the conference that was spun out of the blog post actually that Swyx had, I think maybe it was a year ago, a year ago June or something. He had this blog post about the rise of the AI engineer.

Swyx is this podcaster. He is from Latent Space. He blogs as well. And he wrote this blog post about just observing that with stuff happening in GenAI, that there's all these new skills that people have to have, that obviously it's prompt engineering, but beyond prompt engineering, you're still going to have to write a little bit of software. You're probably going to have to do a little bit of MLOps. You're probably going to have to do a little bit ... The mix of things is a new mix of things, and in retrospect, it's an obvious thing. But he coined it as this is the AI engineer. This is his term, which is genius. And then he had this conference in the summertime, which was absolutely amazing. And if you're doing GenAI stuff, I think it's probably the place to go to. It's in San Francisco, and it was an amazing event.

And for us, we had through the first half of the year leading up to that had been, of course, talking about Graphs and GraphRAG and how you do that with GenAI. And we had some idea, of course, because we've been getting invites and getting good response to our pitches for talks at conferences that GraphRAG has got some appeal. And then we hit that conference, and every workshop, every session, every talk that we gave was packed, and the place was packed anyway. But the thirst, the hunger, the desperation ... No, no, just the interest, the level of interest that people had in listening to what we had to share with them was gratifying and humbling because you never really know until you hear back from people. People are like, "Tell us the thing that you're talking about because we want to hear the thing."

And I've always enjoyed giving talks. It was probably the most ... I don't know, I felt accepted and in a way that maybe it was beyond even whatever has happened before. There's always a level of understanding amongst engineering, like you hear what I'm saying, you understand the concepts, and that feels really good. But this was a little bit beyond that because it was such a new frontier that we're charting and people were like, "We like where you're headed with this stuff." So that was certainly for event-wise the highlight for me.

Jason Koo: Kind of-

Alison Cossette: Just a quick follow up on that. Oh, sorry. Go, Jason.

Jason Koo: So just real quick, so off of that, in my walks of talking to other vendors at conferences and just people in general, it's amazing now, ever since AI Engineer's Fair, they're always bringing up the fact people keep asking me about GraphRAG. And this is from other vendors at other booths, other technologies that make sense for us to work with in the future. But in talking with them, it was always interesting how many had mentioned, "Everyone keeps walking up to me asking, do we support GraphRAG or do we have a future integration with some sort of GraphRAG system?" And so that's really just come up a ton, especially in the second half of this year. So Alison?

Alison Cossette: Oh, I was going to say AI Engineer World's Fair was certainly a moment, and I think the next big moment was at The AI Conference in September. That was another one where I feel like the GraphRAG story was again really relevant and really present. Our CTO, Philip, gave a talk the first day about the overview of it. And that room was huge. What was the capacity in that room? It was like hundreds. There must have been 600 people in that room. I don't even know. It was huge. And then I did the practical getting started the next day, and it was still really busy. It was still really packed. And our booth was flooded afterwards because again, at that point, the GraphRAG story had really gotten its hooks into people. People knew that their RAG applications needed to evolve.

And when we think about it from this business perspective, building chatbots is great, but where's the ROI? Where's the business impact? How is this moving the company forward? And the GraphRAG element really drives that home in a way that I don't know that people understood right away what that business impact and that ROI was going to be when you can bring in that context. So I think that was the other really big shift was not only the awareness of GraphRAG, but the monetary value of implementing GraphRAG over just oh, let's talk to our documents.

Jennifer Reif: Seguing, I think, from events into maybe just what all happened at Neo4j this year. So what were the new things, the latest and greatest that Neo4j was able to provide to the space this year in product releases?

Jason Koo: I can speak to a few of the Python updates. So we released a Python Rust extension, which dramatically speeds up query response times for a number of things. The speed-up varies depending on your dataset and what type of questions you're asking, but it's anywhere from 3 to 10 times faster. If you're still using the original Neo4j Python Bolt package, definitely look into adding the Rust extension, which will automatically pull in the original as well, I believe. And it's just a drop-in replacement, so I don't think you even have to change your import statement, if I remember right. You definitely don't have to change any of the API calls; you just basically pull down that dependency, and you get that automatic speed improvement.
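The drop-in behavior Jason describes follows a common Python packaging pattern: the pure-Python code looks for a faster compiled module and silently falls back when it's absent, so callers never change their imports or API calls. Here is a minimal sketch of that pattern with a hypothetical module name (`_fast_codec` is made up; it is not the real extension's module):

```python
# Sketch of the "drop-in accelerator" pattern: define a pure-Python implementation,
# then try to shadow it with a compiled one. Callers use the same name either way.

def pack_message(data):
    """Pure-Python fallback: length-prefix each string and concatenate."""
    return b"".join(bytes([len(x)]) + x.encode() for x in data)

try:
    # In a real driver this would be the compiled (e.g. Rust) module.
    from _fast_codec import pack_message  # hypothetical accelerated module
    BACKEND = "native"
except ImportError:
    BACKEND = "python"  # no accelerator installed; pure-Python version stays

print(BACKEND, pack_message(["hi", "graph"]))
```

Because the public name and signature are identical in both branches, installing the accelerated dependency is the only change needed, which matches the "no code changes" experience described above.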

Jennifer Reif: That's awesome.

Jason Koo: Yeah, right? And the next thing is we released the Neo4j GraphRAG Python package fairly recently. This provides a couple of APIs that abstract some of the complexities of generating a knowledge Graph building pipeline. So in basically a few lines of code, you can run this process. And it includes retrievers to work with a number of vector data stores; I believe the three that are supported are Weaviate, Pinecone, and Qdrant. So if you're already using or plan on using one of those vector stores, you can set that as a retriever with the GraphRAG package. What that will do is use one of those vector stores for storing the vector embeddings, but all the Graph-related content goes to Neo4j. So through this one package, you're actually using two different data stores to make use of the strengths of both systems. So definitely check that out if you are a Python developer looking into implementing GraphRAG pipelines.
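The two-store split Jason describes can be sketched as a toy. Both stores below are plain dicts standing in for a vector database and Neo4j, and every name is illustrative; this is not the GraphRAG package's actual API:

```python
# Toy sketch of the GraphRAG split: embeddings go to a vector store (Weaviate,
# Pinecone, or Qdrant in the real package), graph structure goes to Neo4j.

vector_store = {}  # chunk id -> embedding (similarity search lives here)
graph_store = {}   # chunk id -> triples (graph traversal lives here)

def ingest(chunk_id, embedding, triples):
    vector_store[chunk_id] = embedding
    graph_store[chunk_id] = triples

def retrieve(query_embedding):
    """Nearest chunk by squared distance, enriched with its graph context."""
    nearest = min(
        vector_store,
        key=lambda cid: sum(
            (a - b) ** 2 for a, b in zip(vector_store[cid], query_embedding)
        ),
    )
    return nearest, graph_store[nearest]

ingest("c1", [0.1, 0.9], [("Neo4j", "IS_A", "graph database")])
ingest("c2", [0.9, 0.1], [("Pinecone", "IS_A", "vector store")])
print(retrieve([0.2, 0.8]))  # nearest chunk plus its graph context
```

The design point is that each store does what it's best at: similarity search answers "which chunk is closest," and the graph answers "what is that chunk connected to."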

Alison Cossette: One thing Neo4j is ... Oh, sorry.

Jennifer Reif: Nope, go for it.

Alison Cossette: Again, data scientist. One of the other things that I thought was really interesting this year was GDS on top of Snowflake and being able to actually use that relational ... Snowflake is everywhere. But having folks be able to actually access the Graph data science algorithms on top of relational, I thought, was another really great thing to share the Graph story, that there's certain algorithms that only run on Graph and really highlight those relationships. And so I think as a starting point for people, it's been a really interesting way to show the power of Graph and show the power of those algorithms from something that's obviously very relational.

ABK: Alison, would you mind, for a non-data scientist, expanding on that a little bit? So is GDS running somewhere separately? Is GDS in a Snowflake environment? What's the setup actually like?

Alison Cossette: Yeah. Can I answer that question accurately? My understanding, and I've only just started working with it, so I apologize, I will get you some more detailed answers. But basically for anyone who's familiar with GDS, Jen, I know you were just talking about this recently, what is a projection? So the idea is that you take what's in the relational database, and it projects a Graph that you can run the algorithms on, but it's not an actual Neo4j database instance. It's a projection into the GDS space that then you can ... A lot of times what folks will use that for is creating new features in traditional machine learning algorithms.

So is it that a PageRank score could be an important predictor of, I don't know, product popularity? Or how can we use it for different types of recommendation engines? There's lots of different ways that it can be used. So it depends on what your use case is, but think about if you had the opportunity to run Graph algorithms and then retain that for whatever that use case may be without necessarily starting ... Again, I would prefer that you have the actual database, but I think it's a really ... We talk a lot about how can we show the power of Graph, how can we show people what's possible, and what can you understand about your data when you look at it in this connected way? And so I think it's just a really interesting way for folks who may not be Graphistas to start to think about how you can leverage Graph in new ways.

ABK: I love this because as computer scientists, there's always somewhat of a question of, do you bring your data to the thing that's doing the processing, or do you take your processing to the thing that has the data? And okay, Snowflake's big and unwieldy. You're not going to jam all that over into ... You could compress that all and put that into a Graph, I suppose, but you probably don't want that. Because as you're saying, the utility here of GDS is to do the algos. And so you bring the compute, which is GDS, over to Snowflake, project from Snowflake into that environment a Graph, run the algos, and then I guess save results back to Snowflake. Does that sound right?

Alison Cossette: Or use them in an application. What you do with that is open. Is it that it's something that sits on top of it and gets projected into an application? Is it something that gets put back? There's any number of ways that you can leverage that. But that happens a lot. I know people say to us, "Well, do I have to put everything into Neo4j to actually get utility out of this? How do we do that?" And my recommendation is always to start small, and I think this is an interesting way to start small because Graphs are certainly inevitable, but-

Jennifer Reif: And I think it's a big ask, too. Data is a lot, and putting something into a database, no matter what that database happens to look like, includes a lot of design work, a lot of requirements work and sign-offs and all that stuff. And depending on your organization, looking at something new or saying, "Hey, we need to rip and replace," is just not feasible. So I think it's about meeting developers where they are and providing some of these capabilities while realizing that it's not practical to rip and replace every time. So you provide these capabilities, and then as they gain benefit or continue to use them, that tipping point shifts, and it may become feasible or even necessary to switch and move to different technologies or different tools. So I think it just gives an opportunity to say, "Hey, here's your situation. There's no need to change that, but let's see if we can add to this and see if we can provide additional value," knowing that you can't just rip and replace on a whim and risk these new situations that are huge for businesses.

ABK: Maybe that's the other catchphrase for Graphs for next year. We'll have Graphs are inevitable and Graphs are a yes-and. Fair enough?

Alison Cossette: 100%.

ABK: Yes-and Graphs.

Jason Koo: Which has really been a great part of this whole retrieval augmented generation story with vectors. I think at first, vectors seemed like a completely different thing. But it's, oh, wait, no, you can do both, which is, I think, also one of the great learnings of this year: oh yeah, they do work better together. And so now we've got this great playing-together, working-together story with many other vendors, and it doesn't have to be this, "Oh, you've got to choose one database or the other." It's like, "Oh, you know what? You should use all the databases."

Jennifer Reif: There's a reason each of these technologies emerged, and that's because something in the existing technology space did not provide the capabilities people needed. And so you create this new tool, this new technology, this new data store in order to fill that gap. And that doesn't mean that there are no other gaps and that one database solves them all. It just means that depending on what you're doing, you'll have different gaps, and different technologies and tools will fill those very unique gaps.

ABK: Totally agree.

Jason Koo: Totally. Should we jump into tools of the year, tools of the month? Oh, no, what happened to Alison?

Alison Cossette: I'm back. I'm back.

Jason Koo: Oh, there you are. Just wanted to shift panels?

Alison Cossette: I was lagging. I had to switch over. I was lagging, so I had to shift over the internet.

Jason Koo: Oh, yes. I think we're all lagging to some degree, which is why I'm like, "Should I pause longer, talk sooner, talk later?" It's all worked out. Tools of the month and maybe tools of the year, might as well do that.

Alison Cossette: Ooh, tools of the year.

Jason Koo: I guess I'll start with my tool of the month. So BAML from Boundary, which is a technology that Ben Lorica mentioned during his keynote. The easiest way to describe BAML is a framework, or tooling, to get structured output from unstructured input. A BAML file consists of basically two components. One is the data model, which is very, very similar to Pydantic in Python. So you have these very strict models. Strict-ish. You can give it some flexibility. And then the functions that define the prompt engineering, the prompts that you want to run. And that's it for the BAML files. Once you've got the BAML file ready, what you can do is make a function call to those prompt engineering functions that you defined in the BAML file, and it will construct output based on the Pydantic-like models that you had specified.

So for a demo with them a couple of weeks ago, what I did was I just took their FastAPI demo. FastAPI is for quickly creating Python servers. And so this endpoint, I would just give it a URL of a web page that had agenda items. I used our NODES agenda to test with. And so the underlying LLM that you choose, in this case I was using OpenAI, would then look at the page, parse through it, and look for data that matched that data model. In the prompt engineering, all I said was, "You are pulling out event information and just returning that."

And so it would go through, and it would find the speakers, events, and all the stuff that I had modeled, and it will only give you output that matches that, with the data filled in, whether it's required or optional. When you ask other LLM RAG systems, you sometimes get the JSON output that you expect, sometimes you get a list, sometimes you get text. There's always this prompt engineering foo you have to do to try to force it to give you the data you want. But BAML has created an internal workflow that strictly outputs only what you're looking for. It either gives it to you, or it gives you nothing.

So if you're asking for a list of these objects, it'll give you an empty list if nothing matches, or it'll give you the items that it finds. And it's quite quick. And once you wrap your head around the fact that you have to create this data model portion, then it all ingests very quickly, very easily. So it offers that intermediary step. With our LLM Graph Builder, you give it input, and then you get the output right away, more or less. BAML offers an intermediary where you can define the exact output and control that layer. So anyways, that was super long, but that's my tool of the month. Anyone else got a favorite tool of the month or year?
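[Editor's note: BAML's own file syntax isn't shown in the episode, but the guarantee Jason describes — a strict schema, and output that is either a valid list or an empty one, never loose text — can be sketched in plain Python with the standard-library `dataclasses` module. All names here (`Event`, `coerce_events`, the field names) are illustrative, not BAML's API.]

```python
from dataclasses import MISSING, dataclass, fields
from typing import Optional

@dataclass
class Event:
    # Required fields, like a strict Pydantic-style BAML data model
    title: str
    speaker: str
    # Optional field: filled when found, otherwise left as None
    room: Optional[str] = None

def coerce_events(raw_items: list) -> list:
    """Keep an item only if every required field is present, so the
    caller always gets a clean (possibly empty) list of Events,
    never a half-filled record or free-form text."""
    allowed = {f.name for f in fields(Event)}
    required = {
        f.name for f in fields(Event)
        if f.default is MISSING and f.default_factory is MISSING
    }
    events = []
    for item in raw_items:
        if isinstance(item, dict) and required <= item.keys():
            events.append(Event(**{k: v for k, v in item.items() if k in allowed}))
    return events

# Simulated LLM output: one well-formed item, one missing 'speaker'
llm_output = [
    {"title": "GraphRAG Intro", "speaker": "J. Koo", "room": "A1"},
    {"title": "half-parsed item"},
]
print(coerce_events(llm_output))  # only the complete Event survives
```

The point of the sketch is the contract, not the parsing: whatever the model returns, the caller only ever sees items that match the schema, or an empty list.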

ABK: I'd love to...BAML, I've heard about this a couple of times now. Ben has mentioned it, Ben Lorica, and it seems really intriguing to me. And what I've understood about part of the magic, I'm just going to riff off that for a second, I apologize. But they've realized that, as you said, part of the challenge is the prompt engineering, but then also basically what amounts to post-processing. And it sounded like they basically attacked that part of it. They're like, "Hey, whatever we get back from the LLM, we know that it's tried to give back useful information that matches what we've asked for." And that they have some post-processing that they do that they're like, "We have a target defined by the schema. We're going to go ahead and try to fill that schema with what we've been given."

And so that's the extra bit of work that's in the BAML. So it's the specification, then the post-processing. From what I've heard, they do such a good job with that, that you can use smaller models or more specific models and get equal or better results than you would have with a large model. So it's nice to set up, it's nice to work with, gets good results and takes a lot of the heavy lifting out of the scene, which I love. I'm going to plus one BAML, first of all, even though I haven't even used it yet. If it's as good as advertised, I'm pre-sold on it, I guess I would say.

And maybe this isn't a tool of the year, but the tool of the month, I guess, for me. And I'm a bit late to this party, so I apologize for folks who are already there. I've only recently started using NotebookLM. I'm like, "This is great. This is the kind of way I'd like to work." Having chats with ChatGPT at some point just gets annoying. I'm scrolling up and down all the time and copying from here. I just want to be able to save snippets of stuff, ask it to do bits of work, and then use that for ultimately assembling some output, a result. And because I happen to live in a Google Docs-connected world, the integration with YouTube videos and Google Docs and everything else is so pleasant. It's like, "This is great. This makes me very happy." All my stuff is there, and I can just do work with it. So that's my tool of the month. It's consumer level, not as engineering-y as yours, Jason, but that's where I'm at right now.

Jason Koo: Nice.

Alison Cossette: I'm going to follow up on that one with ... oh, sorry.

Jennifer Reif: Nope, go for it.

Alison Cossette: ChatGPT-4o Canvas. Can we just talk about Canvas? Love Canvas. That's probably my tool of the year. I love, love Canvas. So for those of you who don't know, basically you have your ChatGPT-4o, but it creates canvases, which are sub-docs. So I use it a lot if I'm working on coding, like, "Oh, can you update this piece, update that piece?" Or maybe I want an overview document, but then I need something that's longer and more detailed. I can work on those simultaneously within the different canvases within the chat. I've used that a lot this year. That's probably one of my tools of the year. Again, not super technical, but it's been a great tool for me.

My tool of the month, actually, I just heard about yesterday, and it's what I'm most curious about right now from Anthropic, the Model Context Protocol. Has anybody heard about this?

Jason Koo: Tell us more.

Alison Cossette: What they're trying to do is ... I know it's super exciting. It's super exciting. So basically what it's trying to do is give you a way to talk to a variety of your different APIs or different connectors to bring data back for context. What I'm playing with right now, again, it just came out, is pulling in something structured, from a Postgres, and then putting that into a Graph. So having those multiple databases live next to each other, and then being able to run GraphRAG on what's been pulled out, the combination of those. That's what I'm playing with right now, and I'm really curious to see where that goes and if it takes root. But the Model Context Protocol from Anthropic I'm super curious about.
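[Editor's note: setting the Model Context Protocol itself aside, the data movement Alison describes — relational rows pulled from Postgres and rewritten as graph updates — can be sketched in a few lines of Python. The table shape, column names, and labels below are entirely made up for illustration; real code would also pass query parameters rather than interpolating strings into Cypher.]

```python
# Hypothetical rows, as a Postgres connector might return them
rows = [
    {"customer": "Acme", "product": "Widget", "qty": 3},
    {"customer": "Acme", "product": "Gadget", "qty": 1},
]

def rows_to_cypher(rows):
    """Rewrite flat relational rows as idempotent Cypher MERGE
    statements that a graph database like Neo4j could ingest."""
    stmts = []
    for r in rows:
        stmts.append(
            f"MERGE (c:Customer {{name: '{r['customer']}'}}) "
            f"MERGE (p:Product {{name: '{r['product']}'}}) "
            f"MERGE (c)-[b:BOUGHT]->(p) SET b.qty = {r['qty']}"
        )
    return stmts

for stmt in rows_to_cypher(rows):
    print(stmt)
```

Because MERGE is idempotent, re-pulling the same rows from the source and replaying the statements doesn't duplicate nodes, which is what makes this side-by-side, pull-and-project pattern practical.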

ABK: I'm going to Google that straight away.

Jennifer Reif: My tool of the month isn't necessarily a tool of the month, I guess. It's a really tiny, small thing. But one thing I really wanted to call out is that I got a notice in my email that GitHub Actions was deprecating version 3 and moving to version 4, which means that anytime you're using upload-artifact or download-artifact, version 3 shuts off December 5th, so it's done. You have to migrate to the new thing. And the backbone of my website is using GitHub Actions. So I'm like, "Well, I need to go out." And I always tease every year that there's one big thing on my website that should be a five-minute fix, and it always ends up taking a week because of all the things that break or other things I didn't realize or have to fix or test or that sort of stuff.

But this V3 to V4 for GitHub Actions was seamless on the upgrade. I went in, I fixed a version in my YAML file, I pushed it to GitHub, and it built, and success. And I went out, and I tested my website. Everything was there. It worked. It was just the simplest user experience ever. And I don't know what they did or what all happened.
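[Editor's note: the fix Jennifer describes amounts to bumping one version tag in the workflow YAML. This snippet is illustrative, not her actual file; the step name and path are placeholders.]

```yaml
# .github/workflows/deploy.yml (illustrative)
steps:
  - uses: actions/upload-artifact@v4   # was: actions/upload-artifact@v3
    with:
      name: site
      path: public/
```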

Alison Cossette: Yay, GitHub.

Jennifer Reif: Yes, what all work they put into that, but to have that kind of user experience for an upgrade and especially a deprecation upgrade was phenomenal to me. And as you said, Jason, the Rust package, where there's nothing that breaks when you add this dependency, that's just all you need in order to take on those benefits. I wish more developers and more tools and more products would do that sort of seamless experience where it's just so easy to just drop it in and use it right out of the gate.

Jason Koo: That's the way it should be, right?

Jennifer Reif: Yeah.

Jason Koo: That's amazing.

Jennifer Reif: That's probably my tool of the month. I will say that over the course of this year, and of course all of you are probably going to be like, "Oh, there she goes again with Spring." But the Spring AI project has made some major strides this year, and I'm just always impressed at the amount of work and the involvement on the Spring and the Neo4j side of things to move that project along and to incorporate user feedback. There's been a lot of things that I've brought up that it's like, "Hey, this doesn't make sense," or "Hey, this seems to be missing." And it's just like, "Okay, we'll put it in for you." Or "Here. Here's the thing. You can put it together and submit a pull request." And so it's just a really engaging, thriving community there. Spring AI is not GA released yet, so that's a hindrance there maybe to some. But it is a really interesting, fun way to incorporate AI into Spring projects and just a really engaging, fun project to be involved with.

Jason Koo: Nice.

ABK: And much respect to Spring. I think you're totally right. But for a project that is as long-running as it is, it is so high quality, great people, great community, and don't need the latest, greatest, let's write from scratch an entirely new thing. No, let's just keep adding on and adjusting, and it's amazing. Good stuff.

Jennifer Reif: Are there any upcoming things that all of us advocates will be involved in, events and things? I know for me, as far as presentations and conferences, I will be doing a Java user group tour in Florida USA just after the new year. So that's really my only thing upcoming.

Jason Koo: I have some events for January. By the time we release this episode, they should be on the calendar, but I'm doing some joint events in San Francisco with Remediate, Diffbot and a couple of other companies. So looking forward to that at the end of January.

ABK: My diary for January is open right now. I haven't committed to anything. I have some options that I have not followed through on yet. But in December, and we're recording this in November, December 12th will be my last event for the year: Connected Data London, which is a knowledge Graph-intensive environment. I'm really looking forward to that. That should be pretty great. And then I do expect Q1 next year to just quickly fill up. It looks like an open calendar right now, but I know that's not actually true once January hits.

Jennifer Reif: Yep, too true. Alison?

Alison Cossette: I am happy to report that I do not have any live events on my calendar for January. Most people who follow me know I have literally been in, I feel like, many time zones this year and was fortunate enough to connect with so many people around the globe. So in January, I'm going to be detoxing from travel, but really looking forward to setting up my calendar for next year. So you may not see me in January, but you will likely see me very shortly after that.

Jennifer Reif: Sounds great. And as always, we'll include in the show notes all of the current articles, videos, and events that are going on that are sitting on calendars and in spaces where hopefully we'll point you to them. But we hope you have a wonderful close to your 2024 and beginning of 2025. Thanks, all, and talk to you again soon.

Jason Koo: Bye, everyone.

ABK: Bye.

Alison Cossette: Happy 2025.