Commit c4ee8a8

Merge pull request #1 from MissMeg/MissMeg-changelog-330-patch
Cleared some unintelligible sections
2 parents 5f3b176 + 1612209 commit c4ee8a8

File tree

1 file changed

+5
-5
lines changed

podcast/the-changelog-330.md

Lines changed: 5 additions & 5 deletions
@@ -78,15 +78,15 @@ That is super-useful, because basically the whole idea is that it brings visibil
7878

7979
We keep on putting more and more stuff inside of Git repositories, and what we're trying to do is "Sure, that's great, but now let's analyze it." Let's use that data we've put in there to try to understand what's going on.
8080

81-
The cool thing about being SQL -- because I was, and I'm still thinking about offering a GraphQL thing, because Git repositories are trees, and once you parse code, you get a tree; so everything's trees, and GraphQL for trees is great... But the fact that it's SQL, it allows you to mix it with other datasets. You have Looker, or Power BI, or things like this where you can have many datasets and do a query across many different databases. Imagine doing something where you're saying -- say inner source; \[unintelligible 00:14:37.09\] will go under inner source is really make sure you break the silos in a company and that everybody collaborates with each other. \[unintelligible 00:14:45.26\] In order to measure how well you're doing, the whole idea is that you need to first know who is in each team. Unfortunately, that is not in your Git dataset, so you're gonna need to mix it with some other dataset - HR dataset, or whatever it is. So Looker, or Power BI, and I think that even Tableau - they will allow you to do these kinds of things...
81+
The cool thing about being SQL -- because I was, and I'm still thinking about offering a GraphQL thing, because Git repositories are trees, and once you parse code, you get a tree; so everything's trees, and GraphQL for trees is great... But the fact that it's SQL, it allows you to mix it with other datasets. You have Looker, or Power BI, or things like this where you can have many datasets and do a query across many different databases. Imagine doing something where you're saying -- say inner source; the whole goal under inner source is really to make sure you break the silos in a company and that everybody collaborates with each other. Like the Google-style, Facebook-style, even though the inner source term \[unintelligible 00:14:51.00\] at PayPal. In order to measure how well you're doing, the whole idea is that you need to first know who is in each team. Unfortunately, that is not in your Git dataset, so you're gonna need to mix it with some other dataset - HR dataset, or whatever it is. So Looker, or Power BI, and I think that even Tableau - they will allow you to do these kinds of things...
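As a rough illustration of mixing Git data with another dataset, here is a minimal sketch using an in-memory SQLite database. The table and column names (`commits`, `team_members`, `repo_owners`, `author_email`, `repository_id`) and all the rows are hypothetical, invented for this example; they are not the schema of any particular tool. It counts commits per pair of (author's team, team that owns the repository), which is one way you might start measuring cross-team collaboration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up Git and HR-style data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical Git-derived table: one row per commit.
        CREATE TABLE commits (repository_id TEXT, author_email TEXT);
        -- Hypothetical HR-style tables, not available from Git itself.
        CREATE TABLE team_members (email TEXT, team TEXT);
        CREATE TABLE repo_owners (repository_id TEXT, team TEXT);
    """)
    conn.executemany("INSERT INTO commits VALUES (?, ?)", [
        ("payments", "ana@example.com"),
        ("payments", "ana@example.com"),
        ("payments", "bo@example.com"),
        ("search", "bo@example.com"),
    ])
    conn.executemany("INSERT INTO team_members VALUES (?, ?)", [
        ("ana@example.com", "payments-team"),
        ("bo@example.com", "search-team"),
    ])
    conn.executemany("INSERT INTO repo_owners VALUES (?, ?)", [
        ("payments", "payments-team"),
        ("search", "search-team"),
    ])
    return conn


def cross_team_commits(conn):
    """Count commits per (author team, repository-owning team) pair."""
    query = """
        SELECT a.team AS author_team, o.team AS owner_team, COUNT(*) AS n
        FROM commits c
        JOIN team_members a ON a.email = c.author_email
        JOIN repo_owners o ON o.repository_id = c.repository_id
        GROUP BY a.team, o.team
        ORDER BY a.team, o.team
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in cross_team_commits(build_demo_db()):
        print(row)
```

Rows where the two team names differ are commits that crossed a team boundary; in a real setup the team tables would come from whatever system actually holds that information.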
8282

8383
**Adam Stacoviak:** You can look up the repo URL on GitHub, even if it's a private GitHub repo as well.
8484

8585
**Francesc Campoy:** Yeah, yeah. So the cool thing is that--
8686

8787
**Adam Stacoviak:** Because you have teams at the org level, so you could look up not so much by the repo, but by the repo URL.
8888

89-
**Francesc Campoy:** Yeah, yeah. So the thing is that all of that -- that is the GitHub API. We work with any Git repository, so all of the concepts that we work with are Git, for now... That's why, for instance, the organization is a GitHub or GitLab \[unintelligible 00:15:39.12\] you could expose it from a different dataset. Just download the whole thing, put it in a MySQL and that's it. You can do that, too.
89+
**Francesc Campoy:** Yeah, yeah. So the thing is that all of that -- that is the GitHub API. We work with any Git repository, so all of the concepts that we work with are Git, for now... That's why, for instance, the organization is a GitHub or GitLab. Point you could expose it from a different dataset. Just download the whole thing, put it in a MySQL and that's it. You can do that, too.
9090

9191
**Adam Stacoviak:** Right.
9292

@@ -128,7 +128,7 @@ For data analysts - if you tell a data analyst "Oh yeah, you should use Git in o
128128

129129
The other change that lots of banks want to do is going back to the inner sourcing. Large banks, they have many IT groups all around the organization, and they want them to work together well, and the first piece is to figure out who is doing what, what resembles to what, how much code duplication you have... We have a thing that analyzes code duplication not character by character, but rather extracting the abstract syntax tree, modifying a couple things, so it's actually a very smart way of figuring out whether two pieces of code are very similar. They're so similar that if you saw them next to each other, you would say "You need to refactor them and just write one function." We're able to detect this automatically, and this helps a lot, because if you imagine you're like the CTO of a bank, and you have this codebase that dates from the '60s, and they tell you "Please put it on the cloud." That's hard. That is a harsh thing to ask from anyone.
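To make the idea of comparing syntax trees rather than characters concrete, here is a small sketch using Python's standard `ast` module. It is not the method described in the episode, only an assumption-laden stand-in: it parses each snippet, renames every identifier to a placeholder in order of first appearance, and hashes the resulting tree. The helper names (`_Normalize`, `fingerprint`) are invented for this example.

```python
import ast
import hashlib


class _Normalize(ast.NodeTransformer):
    """Rename identifiers to v0, v1, ... in order of first appearance."""

    def __init__(self):
        self.names = {}

    def _alias(self, name):
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_FunctionDef(self, node):
        node.name = "f"  # the function's own name is ignored entirely
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._alias(node.id)
        return node


def fingerprint(source):
    """Hash of the normalized syntax tree; comments and layout do not matter."""
    tree = _Normalize().visit(ast.parse(source))
    return hashlib.sha1(ast.dump(tree).encode()).hexdigest()


A = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
B = (
    "def sum_all(items):\n"
    "    # add everything up\n"
    "    acc = 0\n"
    "    for item in items:\n"
    "        acc += item\n"
    "    return acc\n"
)
C = "def product(xs):\n    p = 1\n    for x in xs:\n        p *= x\n    return p\n"

if __name__ == "__main__":
    print(fingerprint(A) == fingerprint(B))  # same structure, different names
    print(fingerprint(A) == fingerprint(C))  # different operator and constant
```

An exact-hash comparison like this only catches functions that are structurally identical after renaming; detecting code that is merely "very similar" would need a fuzzier similarity measure on top.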
130130

131-
The idea of being able to tell them, "Well, actually all of this source code - let's see which parts are gonna be the easiest ones, \[unintelligible 00:21:58.18\] modernizing traditional applications", which is not really cloud-native, but we can make it be cloud-native, we can make it run in Kubernetes... And then what is the COBOL that -- you know, that's gonna be an interesting challenge to migrate. So having a view of all of this by just running a couple queries is really powerful. The other option is literally running -- I'm gonna be very hopeful and say that you could run a really huge Bash\[unintelligible 00:22:25.25\] calling Git very often, and maybe you will get something similar... But it would take hours instead of seconds.
131+
The idea of being able to tell them, "Well, actually all of this source code - let's see which parts are gonna be the easiest ones, like this MTA, "modernizing traditional applications", which is not really cloud-native, but we can make it be cloud-native, we can make it run in Kubernetes... And then what is the COBOL that -- you know, that's gonna be an interesting challenge to migrate. So having a view of all of this by just running a couple queries is really powerful. The other option is literally running -- I'm gonna be very hopeful and say that you could run a really huge Bash script calling Git very often, and maybe you will get something similar... But it would take hours instead of seconds.
132132

133133
**Adam Stacoviak:** Yeah. I think the thing I'm trying to drive out here is that clearly you can pull back a lot of intelligence if you know what you're looking for. It seems like maybe some consulting is involved there, or at least the right kind of teams in place to know how to ask those questions of bigger data analysis for your data analyst, for example...
134134

@@ -164,7 +164,7 @@ The idea of being able to tell them, "Well, actually all of this source code - l
164164

165165
**Adam Stacoviak:** Even a better interface...
166166

167-
**Francesc Campoy:** ...instead of doing "Find function names", it's gonna be "fun.\* " and then something that starts with the letter, whatever... That's a pain to write. And also, what if now you don't have a Go function, but you have a Go method; actually, that will not work anymore, right? So what we're doing is instead allowing you to extract the tokens that you care about. We work with this concept that we call "universal abstract syntax trees", and the whole idea is that it's \[unintelligible 00:25:03.03\] so the result of parsing a program, but it allows you to extract things by using annotations, and those annotations are universal, right? Say, a function -- a function is a function, no matter what programming language you have, right? An identifier, same thing; strings, same thing. So if you want to extract the function names, what you need to do is basically use the \[unintelligible 00:25:25.05\] function, you pass the content, you pass what language you want to use, and then you just pass something that \[unintelligible 00:25:30.02\] that basically says the function names.
167+
**Francesc Campoy:** ...instead of doing "Find function names", it's gonna be "fun.\* " and then something that starts with the letter, whatever... That's a pain to write. And also, what if now you don't have a Go function, but you have a Go method; actually, that will not work anymore, right? So what we're doing is instead allowing you to extract the tokens that you care about. We work with this concept that we call "universal abstract syntax trees", and the whole idea is that it's an abstract syntax tree so the result of parsing a program, but it allows you to extract things by using annotations, and those annotations are universal, right? Say, a function -- a function is a function, no matter what programming language you have, right? An identifier, same thing; strings, same thing. So if you want to extract the function names, what you need to do is basically use the UAST function, you pass the content, you pass what language you want to use, and then you just pass something that, it's an \[unintelligible 00:25:35.00\] thing that basically says the function names.
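The contrast between a regular expression and a parsed tree can be sketched in a few lines. This example uses Python source and Python's built-in `ast` module purely as a stand-in; it is not the UAST function mentioned above, and the sample `SOURCE` and helper names are made up for illustration. The regex only finds top-level `def` lines, while walking the tree also picks up methods and `async` functions without any extra pattern work.

```python
import ast
import re

# Hypothetical sample program to analyze.
SOURCE = '''
def top_level():
    pass

class Repo:
    def method(self):
        pass

    async def fetch(self):
        pass
'''


def names_by_regex(source):
    """Naive approach: match lines that start with 'def <name>'."""
    return re.findall(r"^def (\w+)", source, flags=re.MULTILINE)


def names_by_ast(source):
    """Parse the code and collect every function-like node by its type."""
    tree = ast.parse(source)
    return sorted(
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    )


if __name__ == "__main__":
    print(names_by_regex(SOURCE))
    print(names_by_ast(SOURCE))
```

The tree-based version asks for "things that are functions" by node type, which is the same spirit as selecting nodes by a language-independent annotation.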
168168

169169
**Adam Stacoviak:** Right.
170170

@@ -254,7 +254,7 @@ So now the cool thing is that you can start by doing things like, you know, coun
254254

255255
For many people - the people that really care about deep analysis of large codebases - they tend to also not want to share their source code. So for that it doesn't make that much sense to have a SaaS for the engine.
256256

257-
**Adam Stacoviak:** It makes sense. So if folks \[unintelligible 00:37:54.28\] what can they expect? That's what I was trying to do - tee up the fact that it's sort of an early release; maybe you're even looking for feedback.
257+
**Adam Stacoviak:** It makes sense. So if folks sign up for the beta, what can they expect? That's what I was trying to do - tee up the fact that it's sort of an early release; maybe you're even looking for feedback.
258258

259259
**Francesc Campoy:** Yeah. So that's the whole point - we are trying to get people to use the product, file issues, let us know what they think... File issues for things that do not work, but also for things that they would like to do. This is a pretty young project; we released it two months ago, something like that, so it's pretty early on... The idea is that we're gonna be working with really large companies to try to make it as good as possible, but at the same time we also want to have the input from the community, because they have different needs.
260260
