Timestamps:
02:13 – OpenAI's Sora release and its notable photorealism and video length capabilities
21:06 – Gemini 1.5 Pro's announcement and its expanded context window
35:46 – Practical Applications of Generative AI in Business and the Wall Street Journal article “How Companies Are Starting to Use Generative AI to Improve Their Businesses”
Summary:
In Episode 18, Amith and Mallory discuss Sora, OpenAI's text-to-video model, and Google's Gemini 1.5 Pro, highlighting their advancements in AI technology. They also explore practical applications of generative AI across business functions, emphasizing the productivity boosts that come from integrating AI and the need for strategic planning and early adoption to harness AI's full potential.
Listen or watch below, and scroll down for show notes and the transcript.
Let us know what you think about the podcast! Drop your questions or comments in the Sidecar community.
This episode is brought to you by Sidecar's AI Learning Hub. The AI Learning Hub blends self-paced learning with live expert interaction. It's designed for the busy association or nonprofit professional.
Follow Sidecar on LinkedIn
More about Your Hosts:
Amith Nagarajan is the Chairman of Blue Cypress (BlueCypress.io), a family of purpose-driven companies and proud practitioners of Conscious Capitalism. The Blue Cypress companies focus on helping associations, non-profits, and other purpose-driven organizations achieve long-term success. Amith is also an active early-stage investor in B2B SaaS companies. He’s had the good fortune of nearly three decades of success as an entrepreneur and enjoys helping others in their journey.
Follow Amith on LinkedIn.
Mallory Mejias is the Manager at Sidecar, and she's passionate about creating opportunities for association professionals to learn, grow, and better serve their members using artificial intelligence. She enjoys blending creativity and innovation to produce fresh, meaningful content for the association space. Follow Mallory on LinkedIn.
Disclaimer: This transcript was generated by artificial intelligence using Descript. It may contain errors or inaccuracies.
[00:00:00]
Amith Nagarajan: This is why, a lot of times when we talk about AI and safe use of AI, it's so critical to understand that really the only thing that can keep up with AI is AI.
Welcome to Sidecar Sync, your weekly dose of innovation. If you're looking for the latest news, insights, and developments in the association world, especially those driven by artificial intelligence, you're in the right place. We cut through the noise to bring you the most relevant updates, with a keen focus on how AI and other emerging technologies are shaping the future.
No fluff, just facts and informed discussions. I'm Amith Nagarajan, chairman of Blue Cypress, and I'm your host.
Amith Nagarajan: Greetings, everyone, and welcome back to another episode of the Sidecar Sync podcast. Uh, Mallory and I are excited to be back. There's been so much going on in the last week. We've got a lot of updates for you and a lot of exciting topics. Can't wait to dive into that. But before we get into the [00:01:00] podcast itself, let's just take a moment to hear a word from our sponsor.
Mallory: Today's sponsor is Sidecar's AI Learning Hub. The AI Learning Hub is your go-to place to sharpen your AI skills and ensure you're keeping up with the latest in the AI space. When you purchase access to the AI Learning Hub, you get a library of on-demand AI lessons that are regularly updated to reflect what's new and the latest in the AI space.
You also get access to live weekly office hours with AI experts. And finally, you get to join a community of fellow AI enthusiasts who are just as excited about learning about this emerging technology as you are. You can purchase 12-month access to the AI Learning Hub for $399. And if you want to get more information on that, you can go to sidecarglobal.com/hub.
Mallory Mejias: As Amith mentioned, we have a lot lined up from this week and last. We are diving into three topics today, the first of which is Sora, OpenAI's new text-to-video model. Then we're [00:02:00] going to be talking about Google's Gemini 1.5 Pro. And finally, we're going to wrap up this episode with a discussion around generative AI and business use cases. Amith, are you ready?
Amith Nagarajan: I am. Let's go.
Mallory Mejias: All right, so last week, OpenAI introduced Sora, a text-to-video model that stands out for its ability to generate photorealistic scenes with high visual quality and fidelity to the user's prompts. Sora excels in creating complex scenes with detailed characters and backgrounds, showcasing a deep understanding of language for accurate prompt interpretation and vibrant character emotions.
Now, one of Sora's weaknesses is its struggle with accurately simulating physical interactions in a scene, such as a character taking a bite out of a cookie, but the cookie not showing a bite mark afterward. This highlights the model's current limitations in understanding and representing cause-and-effect relationships and detailed spatial dynamics within generated videos.
Sora also surpasses competitors like [00:03:00] Runway in video quality and length, for now, we say. Competitors like Meta, Stability AI, and Pika Labs offer similar AI video products, with Adobe and Midjourney also said to be entering that space. Sora is not currently available to the public, but it is accessible to select red teamers for harm assessment, and creative professionals for feedback.
Amith, what are your initial thoughts?
Amith Nagarajan: I've got a whole bunch of, I mean, you know, it's interesting because, uh, when Sora dropped, it was the same day that Gemini 1.5 was announced. And, uh, you know, it certainly isn't something that just by happenstance, you know, OpenAI, you know, just happened to hit publish at that moment in time. It was clearly pretty deliberate, which I think is just kind of both funny and interesting, I guess, in a way.
Um, but I think OpenAI probably has a lot of tricks up their sleeve like this that are waiting to go for when someone else has something cool to say in the world of AI. And, you know, Gemini 1.5 is a really big deal. We'll get to that in a little bit. And Sora, you know, certainly is an [00:04:00] exciting announcement as well.
But let's talk about that in a little bit more detail. I mean, to me, the exciting thing was the photorealism of the videos and the length of the videos. It's worth noting for just a moment why that's important. It's not so much just the length itself, it's that the video has to make sense for the entire duration.
And so, the AI has to be able to kind of put together this entire set of concepts, essentially, of what's happening to one or more characters in a scene, in the background, and all the other things that are going on. And so it's more than just saying, Oh, well, you know, Runway could do six seconds, so we're just stitching together 10 Runway videos.
It's different than that because it has to actually make sense. And that's, I think, one of the most significant innovations brought to you by Sora. As you pointed out, Mallory, it's not yet a product that's available. Things tend to move pretty quickly, so I would expect that it would probably be something available at least in a preview release relatively soon, I guess, in the next few months, but I'm purely guessing there. But to me, it's exciting [00:05:00] because it represents really an AI that has a much more profound understanding of the real world than what we've seen before, because to generate videos that realistic,
um, the AI has to have greater innate understanding of physics and of kind of the natural world around it. So I think that's an exciting development.
Mallory Mejias: I'm thinking with image, AI-powered image generators specifically, I mean, even now, if you ask ChatGPT, DALL-E 3, or Midjourney to create an image that has some sort of text on it, even now it will still get words wrong or misspell them. So I'm wondering kind of how did we make the leap from having image generators that can't properly spell text to having minute-long videos that make sense?
It just seems like a big gap there in my mind.
Amith Nagarajan: Sure. One of the things that's so hard to put our brains around is this exponential change that we're involved in right now, where, you know, we're at the part of the curve where it's basically vertical and the pace of change is happening so fast that, [00:06:00] you know, we go from six seconds to one minute.
Yeah. And then to say, okay, well, what about like, you know, a 30 minute video or a feature film, you know, something like that, like what's the gap? Well, clearly, the gap is massive right now in terms of the difference, but the time scale to get from where we are to much longer videos, and videos that include audio, videos that include a whole lot of other assets like text, of course.
Uh, it's it's not gonna be very long. It's going to continue to compound. So, um, there's a lot of reasons for that. I think that, you know, understanding the nature of these models being multimodal, um, is really important. But, uh, you know, it's ultimately the same things we've talked about repetitively here, which is you can't really prepare for a future when you think about things in traditional terms or in linear terms.
Uh, because stuff is happening so much faster than that.
Mallory Mejias: When we talk about these AI models, we talk a lot about this idea of next word predicting. I'm sure that's a little bit different, or maybe it's not all that different, when you're talking about text to video models. Can you give a high level [00:07:00] overview of how text to video models work?
Amith Nagarajan: Sure. Well, the innovation that Sora has behind it was described in a paper on the OpenAI site. And essentially, they talk about a combination of a diffusion model and a transformer model. And the transformer model is the model type behind, uh, what you see in, you know, all the language models. And what they've essentially done is been able to take the concept of what language models do, but apply it to
chunks of video. They call them patches, and these patches of video are used to train the AI, and so at scale, with enough video, the neural network is able to learn these patterns. And, you know, much like the next word prediction, you know, it's kind of like the next patch prediction is what's happening, where it's trying to create the next little bit of video, which,
my understanding is, might be a single frame or it might be a group of frames of the video. Um, and that process continues, and there's some iterative, you know, thing that's happening to keep predicting what you want relative to the prompt. [00:08:00] Um, and so Sora also has the capability to take in concepts like source images, not just text.
And so it's multimodal from that perspective, where I could say, Hey, here's a photo of me speaking, create a one minute video of me, right? That'd probably be pretty scary, but like the idea is, animating still objects is a concept, and that's not novel to Sora. In fact, part of what Gemini's announcement video talked about was doing similar things, um, you know, with multimodality.
And Google also had another similar research release in terms of animation of images to video not too long ago as well that we briefly covered on the pod. So I think the concept, though, that people need to understand is that the basic idea is the same: you have a deep neural network that's trained on content, and in this case, with video creation, uh, it's just a different modality of output than what you're doing with language.
Um, so it's certainly a lot more sophisticated in the sense of the depth of data that it [00:09:00] has access to, since video is such a deeper source of information than text or even images.
Mallory Mejias: But you still think with this idea of patches that it's predicting the next patch, more or less.
Amith Nagarajan: That's my understanding. I don't know a whole lot about Sora beyond what was written on the website. Nobody knows a whole lot about it. Um, but you know, the essence of the idea is very similar. They've built a lot of really interesting engineering into the model, but the basic idea is very, very similar. So it doesn't necessarily represent like a massive leap forward in terms of true reasoning that we've talked about a lot on the podcast. Like, can the model really look at itself and have like a chain of thought or a tree of decision making? It doesn't appear to be doing any of those things from what I've read.
Um, but you know, at the same time, even extending the concept of next token or next word prediction to next video patch prediction seems to be quite, quite powerful. Does it take us all the way to whatever the finish line needs to be in terms of having true [00:10:00] utility from a tool like this? I don't know. But I suspect there will be applications of this tool, particularly like short form videos, particularly for marketing.
I think there's gonna be applications of this tool that will be pretty stunning fairly soon. So I think it's an exciting advancement from that perspective. But from what I've seen, it's essentially the same basic approach from a fundamentals of AI perspective.
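To make the "next patch" analogy a little more concrete, here is a minimal, purely illustrative sketch in Python. It is not how Sora is actually built; the patch size, the frame layout, and the predict_next_patch stand-in are all invented for illustration, and a real system would use a trained diffusion transformer rather than this toy mixing rule.

```python
import numpy as np

# Toy illustration of "next patch" prediction (hypothetical, not OpenAI's actual code).
# A video is treated as a flat sequence of patch embeddings; the model repeatedly
# predicts the next patch conditioned on the text prompt and all patches so far,
# much like a language model predicts the next token.

PATCH_DIM = 16          # size of each patch embedding (made-up number)
PATCHES_PER_FRAME = 4   # how many patches make up one frame (made-up number)

def embed_prompt(prompt: str) -> np.ndarray:
    """Stand-in for a real text encoder: hash the prompt into a fixed vector."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.normal(size=PATCH_DIM)

def predict_next_patch(prompt_vec: np.ndarray, patches: list[np.ndarray]) -> np.ndarray:
    """Stand-in for a trained diffusion transformer: here, just a weighted mix
    of the prompt vector and the most recent patch, plus a little noise."""
    prev = patches[-1] if patches else np.zeros(PATCH_DIM)
    return 0.6 * prev + 0.3 * prompt_vec + 0.1 * np.random.normal(size=PATCH_DIM)

def generate_video(prompt: str, num_frames: int) -> np.ndarray:
    prompt_vec = embed_prompt(prompt)
    patches: list[np.ndarray] = []
    for _ in range(num_frames * PATCHES_PER_FRAME):
        patches.append(predict_next_patch(prompt_vec, patches))
    # Reshape the flat patch sequence back into (frames, patches per frame, dim).
    return np.stack(patches).reshape(num_frames, PATCHES_PER_FRAME, PATCH_DIM)

video = generate_video("a golden retriever catching a frisbee at sunset", num_frames=8)
print(video.shape)  # (8, 4, 16)
```

The point of the sketch is simply that every new patch is conditioned on everything generated before it, which is why longer videos require the model to keep the whole scene coherent rather than stitching short clips together.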
Mallory Mejias: Okay. People have been able to create deepfakes for years, you could say, and so when reading about Sora, I was wondering how exactly AI generated deepfakes that we've seen in the past are different from something like Sora.
Amith Nagarajan: Well, I mean, it's, it's not that different conceptually. And the difference I think probably is cost and difficulty, um, as well as the quality of these things in terms of the output. You know, deepfakes, other than the ones that people have really invested a lot into, typically for really obviously nefarious purposes,
um, have been difficult to create. You know, it required a degree of expertise and a lot of equipment. [00:11:00] But what's happening now with this AI is it's possible to create stuff like that that, you know, is quite convincing, very easily. In fact, you know, we've talked about that in the past with AI
avatars, like the HeyGen tool that we've both worked with in the past. You can create an AI avatar of yourself, and then you can have that AI avatar say anything you want it to say. Um, that's a very narrow example of video generation, right? Where it's a specific subject in a specific background, doing something simple like, you know, verbalizing text.
Um, but if you imagine what these things can do with a lot more power like Sora has, it could be much broader in its implications and much more convincing, much more realistic. I think this has a lot of positive applications in terms of a communication medium, being able to, for example, generate even potentially personalized videos to help people understand a subject or learn a profession.
There's some incredibly exciting aspects of this, but I'm also really scared of this technology. I think it could wreak a lot of havoc in the [00:12:00] world. You know, we talked about the election coming up here in the U.S., um, the presidential election, and you talk about, uh, what deepfakes could do in that kind of environment, or even for local or regional, um, elected positions.
So it's, it's a scary thought to think of this being, you know, released to the world, you know, completely free of restriction. Now, OpenAI itself, of course, will likely have all sorts of safeguards on it. That doesn't mean that people wouldn't be able to get around them. But OpenAI is just one of, you know, literally hundreds of organizations that are doing similar things.
So you're going to see tech like this make its way into the world, and we're going to have to figure out what that means, because you can't really put it back in the bottle at this point.
Mallory Mejias: Ugh, yeah. That's a lot to think about. I'm thinking maybe they would put up safeguards around using certain people's images and things like that, but as you mentioned, if you can kind of upload a reference image, it seems like it would be pretty hard to control what reference images you uploaded and animated and whatnot.
Mm hmm.
Amith Nagarajan: Well, just like with language models, video models and image models will have, you know, basically uncensored, [00:13:00] unguarded open source equivalents. They might not be quite as good as the most, you know, compelling cutting edge models that are proprietary and perhaps locked up, at least for now, but they'll be good enough to do a lot of damage.
And so, you know, this is why a lot of times when we talk about AI and safe use of AI, um, it's so critical to understand that really the only thing that can keep up with AI is AI. And so you think about a regulatory framework, you think about legal frameworks, you think about all these other things that could help potentially contain this.
Those are all essentially things that are trying to catch up with the technology. But I believe that we need to use the technology to be better at detecting AI-generated videos. I don't know how to do that. But the idea is that if AI is used for offense and defense, you know, you have to have AI-caliber tools in order to guard against these problems.
So, um, that's a whole area of research that's happening in the AI field right now that I think is really important. Um, it's beyond my expertise to [00:14:00] talk about how things like that would work in terms of AI video detection, for example. But you know, the point would be that I don't think that we really stand a chance to detect that stuff without really advanced AI. Uh, and clearly we're going to have AI generation capabilities out there, whether we like it or not.
Mallory Mejias: Speaking of detection, I agree with you. I think we're very soon getting to a point where, with the human eye, we will not be able to detect if something is an AI-generated human versus a real human. And maybe we're even already there at this moment, but I found something in my research for this topic that I wanted to share with listeners, because I am always the first person to say, even when I look at our HeyGen avatars, Amith, when I look at basically anything in that realm,
I always call it creepy. I'm like, this feels creepy, this feels unsettling. And I realize there's actually a term for that, and it's called the uncanny valley. I don't know if you've heard of this term, Amith, but it's the phenomenon whereby a computer-generated figure or humanoid robot bearing a near identical resemblance to a human being arouses a sense of unease or revulsion [00:15:00] in the person viewing it.
I realize now I have definitely experienced this uncanny valley. Amith, I'm curious, is this something you experience or do you not notice it as much?
Amith Nagarajan: You know, I experienced it probably most strongly with my own AI avatar, seeing me speak, saying something I'd never said, right? In a theoretically controlled environment on this tool called HeyGen. That really creeped me out. I did not like it at all. As much as I'm a proponent of the technology and an AI optimist, um, I definitely had that sense
very strongly when I saw myself speaking. Um, I think when I see like AI-generated content of other people, or not people, but like AI-generated, you know, avatars, it definitely causes me to step back and look at it. I think I've never interacted with a humanoid robot that has any degree of true resemblance to a person, but like, I think that would absolutely invoke a lot of those feelings for most people.
Um, and actually, it's one of the reasons I think some of the early kind of, uh, humanoid robot firms that are out there putting products into the market right now actually [00:16:00] are intentionally making their forms non-human-like. They're humanoid, but they're clearly not trying to make it look like a true person, right?
So I think there's some really interesting psychological things to be thinking about there. But yeah, 100%. I think that most people are going to have those experiences in the near future if they haven't already. Okay.
Mallory Mejias: Yep, I think it's especially interesting to think of tools like Sora and film creation in the future. Surely this will help animation be streamlined, and we, you know, as humans enjoy animated movies, I think most people do, so I'm not so much concerned there. But I think about Sora or similar tools creating films with AI-generated humans and just wondering, will people like that?
Will people enjoy it? I don't know. I guess that's something to be determined.
Amith Nagarajan: I don't, yeah, I don't know the answer to that either. I think it's, it's an area of exploration. Like it or not, we're all going to be running an experiment on our society in the coming years to figure out how people react to this stuff. I think on the one hand, you know, there's obviously the usual issues that come with an advancing [00:17:00] technology, like what happens to the people who did that work,
you know, for their jobs, right? That's a massive concern for writers, for actors, for, you know, people everywhere in that creative process, because these are things that, like, are coming kind of out of nowhere from most people's perspectives and could, uh, you know, really undermine a lot of industries.
The flip side of it is that potentially, you know, whenever we talk about exponential growth and the resulting compression in cost, um, it makes it possible for people to create animated videos or live action videos, right? That would be out of reach up until now, where you say, Oh, well, wouldn't it be cool to have, like, you know, a really great, like, live action documentary about our company or something like that, or our association?
And that might have cost a million dollars to produce 10 years ago. Maybe in recent years, with advancements in technology, it's $200,000 to do that, but it's still out of reach for most people to do even like a short documentary film about their organization or their company's history or their [00:18:00] association's history.
But now, with this stuff, maybe you could create that, you know, very, very inexpensively. So the point would be that as cost decreases, demand tends to go up. And so the question would be, are there alternatives, um, for the things that are kind of like lower on the quality level? They're still very useful for their purpose, but they're not at the level of like cinematic production or something like that, but they're things that people could use in business that would still have displacing effects on a lot of people from an employment perspective. But the question would be, is there enough output creation or output growth to offset that? And I'm optimistic that over a long period of time the answer is yes, but I'm very pessimistic about it from the perspective of the near term, because I think that the change is happening so fast, you know, people, societies, organizations aren't going to have time to react to it.
So I'm quite concerned about this, but at the same time being concerned about it doesn't mean you can do a whole lot about it other than, you know, really getting educated about what these tools are capable of doing.
Um, I was telling one of my kids [00:19:00] yesterday that, uh, we were talking about AI artwork, not video.
And I said, listen, you know, I, I agree with all the concerns, cause my, my child was quite concerned about all these issues. Um, and I said, Listen, I agree with you about all those concerns. I have the same concerns, but I think that it's really important you learn how these tools work, because that's the world that you are growing up into.
Mallory Mejias: Absolutely. I think on the one hand it is absolutely a threat to creative professions, but then on the other hand, like you mentioned, it democratizes access and allows people to create things they never would have. I mean, most people couldn't animate films easily. That was kind of an elite skill. Last question on this topic, Amith.
We've talked about chatbots at length on this podcast, AI powered image generators, AI powered video generators. What do you think is coming next?
Amith Nagarajan: A lot. I mean, there's a lot. I think that part of what you're gonna see is these things coming together in multimodal models, which is always fun to say, and I know that when we talk about Gemini, we'll be touching on that. [00:20:00] But when we bring this all together, um, that opens up better understanding for these models.
I also think that the same conversation we had back in the fall about Q-star and the general conversation there around models that can truly reason, um, is really the big leap forward across all modalities, whether it's image or text or video. The lack of true reasoning capability, um, is a major limitation right now, and there's engineering methods to get around that if you're constructing a custom software product.
But for most people's use cases, there is a hard limitation on true reasoning. So when we solve for that, that's gonna have an explosive effect on the capabilities of all these models. But I think as it comes to these modalities of video or, you know, music generation or whatever, I believe that you're going to see more and more of that integrated into unified models.
Um, which doesn't mean that there won't be lots of models to work with, but the mainstream major models that you use every day, like the ChatGPTs and so forth, um, you're not going to think about whether you go to them for specific things, whether it's [00:21:00] text or image or whatever. It'll take anything in and it'll produce anything in terms of output.
Mallory Mejias: Okay. Well, moving on to topic two. With the unveiling of Gemini 1.5 Pro, Google is propelling this AI race forward, introducing a model that not only rivals but surpasses current large language models in some efficiencies and capabilities. This next-gen model boasts a mixture of experts architecture, enabling it to process queries with unprecedented speed by utilizing only the necessary parts
of its neural network. A standout feature is its expanded context window, capable of handling up to 1 million tokens, allowing it to analyze vast amounts of data, from hours of video to extensive code bases, with ease. Its ability to maintain high performance with the expanded context window is also notable.
Particularly, its success in the needle in a haystack evaluation, achieving a 99 percent success rate in locating specific text within data blocks of up to a million tokens, is groundbreaking. [00:22:00] This model also showcases advanced in-context learning capabilities, demonstrating proficiency in acquiring new skills from extensive prompts without additional fine-tuning, such as translating a scarcely spoken language based solely on a grammar manual provided in the prompt.
Interestingly enough, Google announced its 1.5 Pro model just a week after it announced Ultra 1.0, which powers Gemini Advanced, the paid tier of its own chatbot. So I feel like, Amith, we said Gemini a few times here, so I kind of just want to clarify this for myself and for listeners as well. Google Gemini, formerly known as Bard, can be thought of as the equivalent of ChatGPT, and then 1.5 Pro, or 1.0 Ultra, could be thought of as GPT-4. Is that right?
Amith Nagarajan: I mean, in terms of capabilities, that seems to be the case. I mean, 1.0 Ultra, from all the benchmarks, was, uh, when it was announced in December, slightly better than GPT-4 in some areas and roughly equivalent in most respects. And 1.5 [00:23:00] Pro, which is a smaller model, and that's an important thing to come back to in a second, um, is also roughly equivalent in capability, in part of the benchmarks, to GPT-4 and 1.0 Ultra. So, it's really interesting on a number of levels, but yes, I mean, all these tools at this point, Gemini 1.5 Pro and 1.0 Ultra and GPT-4, are kind of considered the best of the best at this point. They're in the same general league.
Mallory Mejias: I also want to give a little overview to listeners of the benchmark reports that I found online comparing GPT-4 Turbo to Gemini 1.5 Pro. So, GPT-4 Turbo performs better in mathematical reasoning, code generation, and image understanding. Gemini 1.5 Pro performs better in general reasoning and comprehension, video understanding, and audio processing.
Amith Nagarajan: Yeah, those are good points to make. Benchmarks, like a lot of other things in life, you know, they only reflect, uh, really a narrow subset of reality in terms of how people use these tools. So, uh, one of the things I always tell [00:24:00] folks is, you know, even if you're someone who's, uh, just getting started with this, um, try Gemini and try ChatGPT.
See which one you like better, because the way you use it, you know, might be better or worse for your particular needs. Um, the benchmarks tell you quite a bit. Um, and they're interesting to look at in terms of trying to quantify something that is somewhat subjective. So basically the way the benchmarks work is they have very specific tests in the benchmark, kind of like a standardized test for people, where, like, you take the ACT and there's a certain set of questions. It's kind of a similar process: you run the model through a series of exercises and see what the outcomes are, if they're correct or not, essentially. Um, but yeah, Gemini, you know, the 1.5 Pro release, I think, to me, is a groundbreaking release because of a couple things.
One is the context window that you described earlier, and its radically improved ability to actually have this what's called in-context learning, which basically means the model fully utilizes what's in the context. The context window is essentially the amount of information you can pass to the model.
And, you know, when we started off with ChatGPT back in the fall of '22, the model was very limited. I think it had, like, uh, roughly a 4,000 token context window. That's all it had, which is very, very small. And these models have progressively had bigger and bigger and bigger context windows, um, including ChatGPT's latest version, which has like 128,000 tokens for the highest tier of it. Um, and that's big. So it is bigger, in that a million tokens is obviously very large. And a token, by the way, for those that aren't familiar, is roughly equivalent to a little bit less than a word.
So a million tokens is like 750,000 words, roughly. Um, and that's great. But the really exciting thing is that it actually understands the full context. So if you go to ChatGPT and you put in like an entire book into its context window, with 128,000 tokens you can do that for a lot of books, and then you talk to it about the book.
Um, it's going to forget a ton of the information. These models have historically been good at remembering what's at the beginning of the context and what's at the end of the context. It's almost like a student who pays really good attention at the very beginning of the class and at the very end of the class, but kind of falls asleep in the middle.
That's the way these models have behaved for a while, which is clearly a limitation, because if you have this big context window and it's not really paying attention to what's in the middle, that's a big problem. And the most important thing isn't the million tokens, although that's a big, big improvement.
And Google, by the way, talked about a 10 million token context being something that they tested, which I think will probably come out pretty soon. But the really big thing is that it has a really good attention span. It understands the entirety of the context. If you want to send it a really complex prompt, you could do that.
And if you have a long conversation, if you've ever experienced in ChatGPT where you might have a really long-running conversation, by the time you get like 10, 20, 30 [00:27:00] messages down, it's kind of forgotten what you said earlier, and a long context window solves for that. It also, again, because it has good in-context learning, meaning it has a good understanding of the entire context, is like a really good student who doesn't forget anything, right?
It's like superhuman capability to remember all that as it's processing your request. That's a really big deal. Um, that's going to affect not only your consumer use of ChatGPT, but it's going to affect a lot of applications that are being developed. You know, for example, um, one of the products that our family of companies is building is called Skip, which is an AI data assistant.
And for Skip to work well, Skip needs to be able to look across a vast array of data types in a database. Um, and so access to something like Gemini 1.5 Pro, um, allows you to do things from an engineering perspective that are greater in scale and scope than we could do with, uh, GPT-4 even. So it's, it's definitely an exciting development for that reason.
The only other thing I was going to say about it that I think is really important is that, [00:28:00] um, probably five or six episodes ago, we talked about Mistral, the French AI company that had released something called Mixtral, with an X. Mixtral was one of the earlier mixture of experts models, which, Mallory mentioned, Gemini 1.5 Pro is as well.
And what's important about that is, think of it as a model that has a bunch of helpers in it. So it's a model that has 8, 16, 20, or more specialized sub-models or helpers within it. And when you invoke the model, by talking to it in a chat interface or invoking it through an API if you're a software developer, it makes a decision based on your request to activate
only a subset of those helpers or sub-models. And that's really important because the traditional LLMs have actually activated their entire neural network, which is much bigger, and that is both more costly and energy intensive, but it's also slower. Uh, and so one of the reasons Gemini 1.5 [00:29:00] Pro is really impressive is its speed at the capability level that it has. It's quite fast at inference time, meaning at runtime, when you're, when you're interacting with it.
So it's a really powerful release. It's very exciting.
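To illustrate the "only a subset of the helpers wakes up" idea, here is a toy mixture-of-experts routing sketch. Every number and weight in it is made up for illustration; real models like Mixtral or Gemini implement learned gating inside each transformer layer, not a single standalone function like this.

```python
import numpy as np

# Toy mixture-of-experts routing (illustrative only; real MoE models use trained
# gating networks inside a transformer rather than random stand-in weights).

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # number of specialized sub-models ("helpers")
TOP_K = 2         # how many experts are activated per request
DIM = 32          # size of the input representation (made-up number)

# Stand-ins for trained weights: a gating matrix and one weight matrix per expert.
gate_weights = rng.normal(size=(DIM, NUM_EXPERTS))
experts = [rng.normal(size=(DIM, DIM)) for _ in range(NUM_EXPERTS)]

def moe_forward(x: np.ndarray) -> np.ndarray:
    # 1. The gate scores every expert for this particular input.
    scores = x @ gate_weights
    # 2. Only the top-k experts are activated; the rest stay idle,
    #    which is what saves compute and time at inference.
    top = np.argsort(scores)[-TOP_K:]
    weights = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over the chosen experts
    # 3. The output is the weighted combination of just those experts.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.normal(size=DIM)   # a toy "request"
y = moe_forward(x)
print(y.shape)             # (32,) -- produced by only 2 of the 8 experts
```

The design point the sketch tries to capture is that compute scales with the number of activated experts, not the total number of parameters, which is why MoE models can be both large and fast.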
Mallory Mejias: Do you think we're heading in the direction of all these models using a mixture of experts architecture?
Amith Nagarajan: I think mixture of experts will be the common pattern for just about every major model that you interact with. So the short answer is yes. I think mixture of experts is going to be exactly what people start using because it's had a significant impact on performance and quality.
Mallory Mejias: And then you also mentioned at the earlier part of this conversation that 1.5 Pro is a smaller model. Is that what you meant in terms of the mixture of experts architecture?
Amith Nagarajan: Well, no, 1.5 Pro is actually smaller overall. So even if you kind of look at it separately from the idea of mixture of experts, Ultra is their largest model, and they'll have a 1.5 Ultra at some point, I'm sure, and it will be even more powerful. But, you know, what's happened is there are these things called the scaling laws of AI, where [00:30:00] the more data and the more compute you throw at a model, the bigger the model gets, and the bigger the model gets, the more performant it is in terms of its capabilities, but also it becomes more expensive: obviously, more expensive to train, more expensive to run, and typically slower to run.
Um, and so smaller models are really interesting because they're portable. They can be run in lots of different places, and they can also be, uh, in many cases, much faster. So a smaller model that's performant at the same level as their largest model that was just released is quite a stunning announcement.
And to your earlier point about, you know, their release cycle and what they're doing. I mean, Google seems to be like all over it this year. You know, we all knew that they were working hard at this all last year and they were behind and people were giving them a really hard time about it. But you got to remember, Google actually invented the transformer architecture.
Google invented a lot of the technologies that power modern AI beyond that, and so they're no slouch and they're coming at it hard. And so you're gonna see a lot more from them. In fact, after Gemini [00:31:00] 1.5 Pro was announced, just, I think, in the last 24 hours, they announced open source models called Gemma, and these Gemma models are based on the same architecture as Gemini, but they're open source, and they have a two billion and a seven billion parameter
model of Gemma, which, according to their benchmarks on those releases, and we'll include this in the show notes, are performing at the level of not only Mixtral from the Mistral company, but also, uh, Llama 2, which is, um, Meta's open source AI model. Uh, so really, really interesting to see what Google's doing and how fast things are moving.
Um, I think the application of this is, you know, as people are just like going along doing their daily work in an association or, or wherever, um, they might be wondering like, how does this apply to me? Um, the way I would describe my suggestion to most folks who are not software developers is to just try these things out.
Don't get stuck in a rut where you say, Hey, I've gotten onto the AI bandwagon. I learned how to use ChatGPT. I use it every day for my work. Cool. That's awesome. And I applaud those efforts. But [00:32:00] don't stand still. Try out Gemini and see if Gemini can solve some of the problems that ChatGPT didn't solve for you, right?
And then there's also this other thing, this bias of having a one-time failure leading you to believe something can't be done. This happens to all of us, you know. You do something, you try it out, and you fail, right? And you go, oh man, that didn't work, whatever it is, right? And then oftentimes you just never go and try that again. And it's hard to not do that exact thing, because once you've failed at something, you don't really want to experience that again. With AI models changing so fast,
let's say you had some fairly complex thing you were trying to do with ChatGPT three months ago. Try it again in ChatGPT, for sure, but also try it in Gemini. Um, and if you're a software developer, then, you know, you just have yet another tool in your tool belt now that can do things that, you know, GPT-4 cannot do in terms of that context window.
So I think it's important to, you know, keep rethinking problems. Um, and then the flip side of that is think about [00:33:00] problems you cannot yet solve and catalog them. Start thinking about the things you'd like to do with AI but you don't think AI is up to task on just yet, and keep an inventory of those. Like, keep a shared Google Doc or spreadsheet.
Um, and then go back to that list. Like, have a list of 5, 10, 20, 100 things that you don't believe AI can do. And whenever you hear these kinds of announcements, go test them out, see if they work, because you'll find that an increasing number of those things become possible.
Mallory Mejias: I will let all of you listeners know, I will be one of the first people to try out Gemini 1.5 Pro when it's released to the public. Because honestly, sometimes ChatGPT drives me nuts, like with what you said, Amith, in terms of having a really long thread, which most of my threads with ChatGPT are, really long,
because they're for ongoing projects like this podcast or blogs or things that we're doing at Sidecar. And it absolutely will forget things that I've told it, you know, 15 messages ago, 30 messages ago. So I'm intrigued by this whole idea of the needle in a haystack test, which, by the way, I looked [00:34:00] up, Amith.
Do you know kind of how this works when they're testing the models?
Amith Nagarajan: No, I'm not familiar with that.
Mallory Mejias: So it seems like the first time they ran this was to test GPT-4 against Claude 2.1. And they gave the models, I don't know exactly how many tokens, but lots of tokens, essentially. Um, and they were essays of varying lengths, back to back to back, um, about different topics.
And then buried within that at different levels, sometimes within the first 25 percent of those essays, or the middle 50 percent, or the bottom, um, the bottom 25 percent, they put a line in there that said something along the lines of, the best thing to do in San Francisco is go to Dolores Park and eat a sandwich in the sun.
And that one sentence was buried somewhere within there. And then they would give the model all of those essays and ask, what's the best thing to do in San Francisco? And see if it hallucinated or if it referenced, uh, that text. So hearing that, potentially, Gemini 1.5 Pro has a 99 percent success rate
With that kind of thing, I'm really excited about that.
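For anyone curious what a needle-in-a-haystack harness can look like in practice, here is a rough sketch. It is a simplification of the published evaluations: the filler text, the fake_model stand-in, and the exact prompt format are ours, and a real run would send the assembled prompt to an actual model API and vary context length and needle depth systematically.

```python
import random

# Toy "needle in a haystack" harness (illustrative; real evaluations use long real
# essays and call a real model API where fake_model is used below).

FILLER = "The history of urban planning offers many lessons about growth and tradeoffs. "
NEEDLE = "The best thing to do in San Francisco is go to Dolores Park and eat a sandwich in the sun."
QUESTION = "What is the best thing to do in San Francisco?"

def build_haystack(num_sentences: int, depth: float) -> str:
    """Bury the needle at a chosen depth (0.0 = start of the context, 1.0 = end)."""
    sentences = [FILLER] * num_sentences
    sentences.insert(int(depth * num_sentences), NEEDLE + " ")
    return "".join(sentences)

def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call; a real harness would query the model here."""
    return "Go to Dolores Park and eat a sandwich in the sun." if "Dolores Park" in prompt else "I don't know."

def run_trial(depth: float) -> bool:
    context = build_haystack(num_sentences=5000, depth=depth)
    prompt = f"{context}\n\nQuestion: {QUESTION}\nAnswer:"
    answer = fake_model(prompt)
    return "Dolores Park" in answer   # did the model retrieve the buried fact?

results = [run_trial(depth=random.random()) for _ in range(20)]
print(f"Retrieval success rate: {sum(results) / len(results):.0%}")
```

Scoring many trials across different depths and context lengths is what produces the success-rate figures, like the 99 percent number Mallory mentions for Gemini 1.5 Pro.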
Amith Nagarajan: Yeah, I think that's going to be really [00:35:00] powerful. I mean, the application of that might be, if you're thinking about like, oh, you're working with ChatGPT or Gemini to create a blog post and you're going back and forth. And maybe the first thing you're doing is an outline, and then after that you're going back and forth and iterating on it.
By the time you get 5, 10, 15 messages deep, it remembers like the most recent two or three interactions, but it forgot the outline that it started off with. And, you know, that's the limitation that Mallory's describing that, uh, Gemini 1.5 Pro should be able to overcome handily. So that's super exciting because, you know, again, these assistants that we're using, that we're getting accustomed to using in our lives, they're so remarkable in what they can do for us.
So we're all excited about it, but at the same time, they have these limitations that almost seem kind of comical, uh, in terms of their forgetfulness and other factors that we're very quickly overcoming.
Mallory Mejias: And finally today, we want to wrap up with a discussion around generative artificial intelligence and business. For a deep dive into the practical application of generative AI in businesses, the Wall Street Journal's CIO Journal conversed with [00:36:00] three industry leaders, Sesh Iyer of Boston Consulting Group, Satheesh Muthukrishnan of Ally Financial, and Fletcher Previn of Cisco.
These discussions shed light on how different organizations are leveraging AI to enhance efficiency, productivity, and innovation across various functions. So we pulled out a few key insights from that conversation, and we want to do a little bit of discussion around those. Um, one of the first insights was to deploy existing software tools with generative AI features in them to see a 10 to 20 percent productivity lift in day-to-day work.
Another was to use generative AI to summarize customer care calls, driving consistency in training and allowing customer care associates to focus more on customers. Next, using generative AI to follow up with potential employment candidates. And then the next use case is a little interesting. A broader impact of 30 to 50 percent productivity boost is feasible when AI is fundamentally integrated to reshape key [00:37:00] business areas like software development, marketing, and individual business specializations.
The next insight was to re evaluate and consolidate software contracts to eliminate redundancy and better utilize budgets. This is something we talk about within the Blue Cypress family of companies all the time, seeing if we have all these different people using a tool, maybe it's better to go with a team's license or an enterprise license in that case.
Create detailed plans that outline data usage, customer impact, and value generation before seeking budget approval for AI projects. And this last one we talk about a lot on this podcast: engaging early in AI technology exploration is crucial. Find a balance between the immediate budgetary constraints
and the significant long-term potential of AI to transform operational efficiencies and competitive positioning. Previn, the Chief Information Officer at Cisco, says, and I'm paraphrasing, don't miss out on AI because you couldn't close your Q4 budget. So, Amith, which of [00:38:00] these insights would you say kind of resonates with you the most?
I have an idea, but I'll ask you first.
Amith Nagarajan: Well, I think the whole article was great, and we'll include it in our show notes as well, so interested readers can dive deeper into the source article itself. But I actually think the last point, the one that had to do with budget, was the thing that struck me the most, and I think it's really important. Part of what I share with leaders in the association and nonprofit market when I speak to them about AI investment is that you should look at what you could stop doing outside of AI too, because it might be that you don't have any more money to spend or more time to spend on
AI initiatives compared to what you already have on deck. But it is worth stepping back and asking yourself, well, could we stop or pause some of the projects that we have right now that may not be as important in the world of AI? And my favorite example of that in the association world is an AMS implementation.
So if you're in the process right now of selecting or implementing one, um, you know, those things take a long [00:39:00] time. They're very, very expensive. They're also really slow, and usually
a new AMS will only have a modest increase in efficiency for the organization. It's usually, you know, a 10 percent to 30 percent improvement, and those are just numbers I'm making up. They're not based on data. But in my experience, having been in that world for a long time, you know, you tend to see improvement, but it's not radical, and it takes a lot of time and a lot of energy. You know, an AMS
implementation might take a year or two years to go through, and possibly longer if you include the selection cycle. So, you know, rather than proceeding with that, perhaps you hit the pause button there and say, well, yeah, maybe we need to do that at some point, but perhaps we can hit the pause button and invest that energy and some of those dollars into exploring AI experiments. Um, and so there's always creative ways to think about things if you're willing to compromise on what you say your priorities are.
So most people have a significant amount of energy and resourcing, but they already have allocated those resources to existing [00:40:00] things. And I'd ask people to say, given how important AI is and how transformative it is, wouldn't it be wise to potentially hit pause on something that wouldn't affect your business that much if it was delayed 3, 6, or even 12 months, and invest that energy and perhaps some of those dollars in some experiments?
So that to me was one of the main takeaway points I had from that. The only thing I'd say is, like, even the first point you made about existing software tools with added generative AI features, that's not even, you know, the full integration that you're describing later, and a 10 to 20 percent productivity lift is meaningful.
I mean, even if it's a small thing, you know, a 10 to 20 percent productivity lift from a very limited effort is actually tremendous. I mean, if we think about that order of magnitude of improvement in productivity for something that's very inexpensive, or even free in some cases, it's stunning. I mean, you don't see that every day.
So, you know, looking for those simple, easy, and maybe lower-return things that are also low-investment could be a good way to get started for a lot of people.
Mallory Mejias: For the record, I did think that the last insight, the one that I read, was going to be the one that resonated with you the most. So I called that. In terms of this productivity boost, Amith, I am curious, because you are an analytical person for sure: what percent productivity boost have you seen kind of in the past year with your work using AI,
if you had to guess?
Amith Nagarajan: You know, it depends on the type of thing that I'm working on. So if I'm doing any type of writing of any significance, it's probably like a 10x increase in performance. So, you know, if I have an idea, like, the choke point for me in getting my ideas into, you know, something someone else can consume, you know, is the process of typing it in, editing it, all that stuff.
I have plenty of ideas. Um, but I now have a partner who can work with me at essentially light speed to turn raw ideas into well-thought-out, you know, written paragraphs. And so to me, that's enormous. I mean, even the Ascend book, we talk about this a [00:42:00] lot. Um, we used AI heavily, heavily to create the Ascend book in a very short period of time.
The ideas were all original. We had all these ideas for how AI should be used, and we did go to AI as a brainstorming partner for a lot of the, a lot of the details, but we heavily used AI to craft that book. Um, you know, we have articles on our website that we heavily use AI to help us craft. Um, and so for me, you know, it's probably like in that category, at least a doubling of productivity, probably more like a 4 to 5x increase, uh, as opposed to measuring the percentage.
So it's like, what would that be, 80, 90 percent improvement? Um, and another area that I spend a lot of time in is working with technical folks across our family, doing software architecture, sometimes even diving into the code myself and doing some software development, which is like a lifelong passion
I've had since I was a kid. I'll never give that up. And I would say that with the tools that we have now in software development and software design, there's a similar multiplier effect, at least a doubling in productivity, for people who are [00:43:00] using things like Copilot and others within the coding environment.
Um, it's, it's just amazing. So you end up, if, if you're willing to fully embrace these tools and learn how to use them, and it's taken me a while to figure out how to use these tools, obviously, um, you get a very large productivity increase.
Mallory Mejias: Mm hmm. In this article, they mention a distinction between that general productivity boost with AI, as opposed to reshaping critical business functions using AI. Can you talk a little bit about that difference?
Amith Nagarajan: Well, I mean, when you're talking about specific use cases versus general-purpose application capabilities, you know, that's where the general-purpose thing is going to affect basically everything. So if you can, like, use a tool to help you write faster, like you take a sentence and turn that into a paragraph, that's generally useful, but it's non-specific, so it may get things wrong a lot.
You know, it might be very, very good at things that are very broad in nature, but when you drill down, they're not as good. Um, and [00:44:00] then, like, if you think about particular business processes that you want to add automation to, let's say, for example, we want to do a marketing campaign and our goal is to improve member retention.
And so we have this idea where we want to improve member retention, and we think that if we have more segments in our audience, we'll be able to do a better job of communicating the value of membership, and then through that we'll have a higher renewal rate, right? That's a common hypothesis. Where people get stuck
a lot of times is, first of all, determining the customer segments, or even doing truly personalized email, right? And then, of course, executing on those segments, saying, Hey, we're gonna develop truly personalized content, because that's a lot of work. So now, you know, if you were to say, well, we're gonna develop an AI use case specific to that need, um, AI can help you craft the segments, can help you essentially analyze your data and say, Oh, well, these might be segments that could make sense based on the data that I see. Um, and there's a lot of AI that can help with that. And then, of course, crafting the actual individual [00:45:00] messages is another thing that you can do.
And if you package that all into an email marketing software tool that is able to help you do segmentation, do copywriting, and then do A/B testing on different variations of copy, and it's in one package that's specifically built for that type of purpose, that's where you start seeing massive increases, both in productivity, which we're talking about, but even more importantly than that, the outcome, you know. Because if I can just throw more labor at it, I can say, well, let's do that personalization if I'm willing to spend the money to hire more people. Cool, we can do that.
Um, but the quality can be even higher using AI and automation, because you can go faster and have faster iteration. Uh, particularly with A/B testing, what you're starting to see are marketing tools that are able to not only A/B test, but essentially A/B test with an infinite number of potential permutations, um, and to take feedback from the actual signal results, essentially the clicks and opens and the conversions you're getting through the marketing funnel, and actually craft [00:46:00] new versions of copy on the fly.
Now, would you, as the marketer, be willing to let the AI go in full autopilot like that and do it? Maybe not, but it might be like a copilot scenario where it's like, Hey, Mallory, this campaign you're running for digitalNow, um, we found that this particular variation works really well, but we want to try these three others that are derivations of that, and it automatically suggests it.
And you just say, Yeah, that sounds good. Click. Okay. And then the tool does everything else for you, right? That's where you start seeing both massive productivity increases, but also you see outcomes improve, which is even more important. Yeah.
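For readers who want to picture what that kind of automated testing loop might look like under the hood, here is a small hypothetical sketch. The subject lines, the click rates, and the simple epsilon-greedy selection rule are all invented for illustration; real marketing platforms use their own (typically more sophisticated) bandit algorithms and would generate new variants with a language model rather than pick from a fixed list.

```python
import random

# Toy sketch of automated A/B testing over email copy variants (hypothetical tool
# behavior, not any specific vendor's feature).

variants = {
    "Renew today and keep your member benefits": {"sends": 0, "clicks": 0},
    "Your membership is about to lapse - here's what you'd miss": {"sends": 0, "clicks": 0},
    "A personal note about your membership renewal": {"sends": 0, "clicks": 0},
}

EPSILON = 0.1  # fraction of sends used to keep exploring the weaker variants

def pick_variant() -> str:
    if random.random() < EPSILON:
        return random.choice(list(variants))  # explore
    # Exploit: choose the variant with the best observed click rate so far.
    return max(variants, key=lambda v: variants[v]["clicks"] / max(variants[v]["sends"], 1))

def record_result(variant: str, clicked: bool) -> None:
    variants[variant]["sends"] += 1
    variants[variant]["clicks"] += int(clicked)

# Simulate a campaign where one variant secretly performs best.
true_rates = dict(zip(variants, [0.04, 0.07, 0.12]))
for _ in range(5000):
    v = pick_variant()
    record_result(v, clicked=random.random() < true_rates[v])

for v, stats in variants.items():
    rate = stats["clicks"] / max(stats["sends"], 1)
    print(f"{rate:.3f} click rate | {stats['sends']:>4} sends | {v}")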
Mallory Mejias: That makes sense. So at this point, though, we do not have tools like the one you just described, or do we?
Amith Nagarajan: If I, if I ran a company like a HubSpot or someone else that had a, you know, a broad-based marketing tool, I'd be working on all that right now. And I'm sure these people are; what I'm describing is not rocket science to people in marketing automation software companies. So you're going to see this, you know, Cambrian explosion, so to speak, of tools and integration capabilities within [00:47:00] these, these well-used software tools.
So if you're using a mainstream software vendor for email marketing, like a HubSpot, um, or a Salesforce or something like that, you're gonna see capabilities like that pop in very soon. I have no doubt about it. But it's this full integration. Really, the point is this full integration of the business process, right?
As opposed to, like, I use a general-purpose tool for each individual part. I can use one tool to create my segments. I can use another tool to create copy. I can use another tool to evaluate the effectiveness of each of the branches of my A/B test. I can use yet another tool to take all that data and then create the next version of the campaign.
But what if it was all built together into one unified process, right? And that's where you start to see some significant increases. Uh, marketing was an easy low hanging fruit example for me to describe in the pod. Another one I threw out there that's more association specific might be around the certification process.
Um, a lot of certification processes require manual intervention where, for example, when people are uploading evidence of time in the profession, that's a common requirement for [00:48:00] professional certification, where people have to prove that they've worked a certain number of years or a certain number of hours in the profession, or taken certain tests from third parties that they don't have digital records of, and blah, blah, blah. All these things require a person on the association staff to manually review and approve or reject the content that was provided.
That's a good example of, yes, the raw tools are capable of doing those things. But if you weave them together into a unified process, you can nearly fully automate a lot of those things. And that's a completely different level of automation and efficiency.
Mallory Mejias: I know we promised we would talk more about Microsoft Copilot on this podcast, and I'm sure soon we'll probably have a whole segment or topic dedicated to it. But I wanted to share with listeners that one of the features of Microsoft Copilot I've been getting a lot of use out of is in my meetings. I turn on Copilot,
which just turns on the transcription for the meeting, and then afterwards you have like a chat interface, a chatbot, that you can interact with and ask questions about your meetings, to say, Oh, what did Amith [00:49:00] say about this one website thing that we need to make an update on? And I've had pretty good success with it so far.
I mean, I don't know if you've used that feature at all.
Amith Nagarajan: I have, yeah, and, you know, generating suggested action items out of the meeting as well is really powerful, so you don't drop the ball on anything. Uh, I think that'll find itself further into the workflow over time, where you take the natural extension of that and say, Okay, well, these are the action items that Mallory and Amith and whoever else was in the meeting agreed to.
Then it will follow up and make sure you're doing those things and even suggest helping you with those things. So, you know, Copilot, even though it's integrated quite nicely into the Microsoft Office suite, is still version one, even version 0.9 you could call it. Um, where it's going to go, and part of what Microsoft's publicly said is their vision, is that the copilot is more unified, so it really understands you throughout your application experience.
So the copilot in Word and PowerPoint and in Teams is really the same copilot. It knows you so well. Part of what we talked about earlier with advancements in models like Gemini, having much broader [00:50:00] context, is that Microsoft and others will be able to power application features like that, uh, you know, far more fluidly, because they'll have models that will support much bigger context, right?
And that context, just think of it as its working memory. And if an individual that you're working with has very poor working memory, it's hard for them to be very effective at helping you. Similarly, these models have been very much, like, you know, task-specific in your use, and then they kind of stop working.
But if they had like full understanding of everything you've done for the last 30 days in your work, that kind of model could be stunningly powerful.
Mallory Mejias: I'll admit, I still find myself, at least at this point, going out to ChatGPT and continuing my threads where ChatGPT forgets what I said 15 messages ago, but I look forward to the copilot that you were describing, for sure.
Amith Nagarajan: Well, the interesting thing about all this stuff is, um, first of all, people get excited about truly revolutionary products like language models and ChatGPT, and then they get used to them very quickly, because they're like, Oh, well, of course I have that, and how come they're so bad? Right?
And [00:51:00] then the next generation and the next iteration comes along and very quickly the bar gets raised. And if you look back in time, even to 12 months ago, and you say, what was the AI that you were using 12 months ago? It was complete garbage compared to what you're using today. Right? And the same thing will be true 12 months from now, or even six months from now.
And we look back and say, yeah, you put up with the thing forgetting half of what you said. How could you possibly have used something that, that low quality? But then the reality is, if you take it completely away and say, how do you get your job done without ChatGPT? Like, that's an interesting thought experiment.
I wouldn't, like, wish it upon you. But, like, if you were to say, like, you're not going to use AI for even a day, what does life feel like? You know, write an essay about that if you're used to using it. It would be rough. Um, and so it's the same thing: I don't think you can do this, but if you could go back and use an earlier version of ChatGPT from a time machine and experience that, you know, you'd realize how far it's come.
So I think it's interesting perspective to reflect on from time to time.
Mallory Mejias: You're right. You're definitely right. Fall of '22 was when Thomas Altman from Tasio [00:52:00] showed me the GPT-3 developer playground, and it blew my mind. And I was like, nothing will beat this. This is the craziest thing I've ever seen. And now I'm like, oh gosh, ChatGPT just needs to be better. We tend to forget.
Um, that would be an interesting article to write, of not using AI for the week. I don't know if I'm ready for it yet, Amith, but maybe, maybe when things slow down, possibly.
Amith Nagarajan: Yeah, maybe someone will volunteer. One of our listeners will volunteer to do that and write an article about it. I don't think I could put up with that. I'm an extremely impatient person, and AI has helped me fuel that, uh, that flaw. But it's, it's one of those things where I think perspective is helpful.
And it's not only helpful from a reflection perspective, but it's also helpful from a forecasting perspective, to think, okay, well, that's where we were 12 months ago; what potentially could be the case 12 months from now? And again, it just goes back to the point I made earlier: um, you know, just because it can't do something that you want it to do right now, catalog that so you don't forget the idea, and come back to it in three months, six months, or 12 months' time and try it again.
Mallory Mejias: It will be very interesting to have all these episodes to look back on [00:53:00] and have AI analyze and see what we used to be concerned about, um, in previous months and previous years.
Amith Nagarajan: That'd be a good episode, Mallory. We should run AI on all of our podcasts once we get to like podcast episode 30 or 40 and have it run that analysis for us.
Mallory Mejias: Yeah, let's do it. We're putting it out in the world right now. Um, I know that this is a conversation we will continue to have, uh, generative AI in business, and I really look forward to seeing kind of what advancements we see within the next few months and the next few years. Amith, thank you so much for your insights today.
Amith Nagarajan: Thanks very much.
Mallory Mejias: We'll see y'all next week.
Amith: Thanks for tuning into Sidecar Sync this week. Looking to dive deeper? Download your free copy of our new book, Ascend: Unlocking the Power of AI for Associations, at ascendbook.org. It's packed with insights to power your association's journey with AI. And remember, Sidecar is here with more resources, from webinars to boot camps, to help you stay ahead in the association world.
We'll catch you in [00:54:00] the next episode. Until then, keep learning, keep growing, and keep disrupting.