Did Microsoft just revolutionize web search in 2023? 🤯


chef


I just went to the OpenAI ChatGPT website, signed up, and ran some queries from there. It's certainly a different way to make queries and takes a little getting used to. Rather than asking questions and being presented with a lot of links, you now get an easy-to-follow answer, without having to delve into links and websites.


8 hours ago, kaj said:

I just went to the OpenAI ChatGPT website, signed up, and ran some queries from there. It's certainly a different way to make queries and takes a little getting used to. Rather than asking questions and being presented with a lot of links, you now get an easy-to-follow answer, without having to delve into links and websites.

Yeah, the answers you can get from ChatGPT are definitely amazing and ground-breaking. It's a huge step up from GPT-3, where the results were much more random. The fact that it takes all the previous questions and answers (from a single conversation) into account makes a huge difference for one thing, but they have also made incredible progress with regard to the accuracy and truthfulness of results.

When it comes to programming questions, the real power is not in the first answer you get; usually you could get the same results easily with a normal web search as well. But then you can keep asking and set up conditions and requirements until you get something you actually need - and for those things, you would often need to scan through dozens of search results to find what you're actually looking for. Sometimes you need to be a bit insistent. From my experience, I would recommend always checking back and asking things like "Are there other ways?", "Is this the best way with regard to performance?", "Is this code compatible with ...?" etc.

It is also very good for things you can't search for, like for example "Can you explain the following regular expression? ....." => "Can you rewrite it in a way that it also matches xyz?"

And while I was writing this, I picked an arbitrary regex from Emby's matching and asked:

Quote

Can you explain the following regular expression?
(?i)[-._ ]+S(?:eason ?)?(\d{1,3})(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?(?:[[({][\da-f]{8}[])}])

And the answer was one of those WTF moments....

Quote

This is a complex regular expression used to match strings that represent TV show episodes.

Damn, how did it come to that conclusion?

And the rest of the answer didn't just explain the regular expression technically, it even explained it in the context of matching TV episode file names:

Quote

Here is a breakdown of the components of the expression:

(?i) - The (?i) flag is used to make the regular expression case-insensitive.

[-._ ]+ - This matches one or more of the characters '-', '.', '_', or ' '. This is used to separate the components of the episode identifier.

S(?:eason ?)?(\d{1,3}) - This matches the string "S" followed by an optional "eason " and a number with 1 to 3 digits. This number represents the season of the TV show.

(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})) - This matches an optional episode number with 1 to 3 digits. The episode number is preceded by an optional "ep" or "e" separated by a space or dot, and optionally surrounded by spaces, underscores, dots, or hyphens.

(?:[_ ]?v\d+)?)+ - This matches an optional version number with one or more digits, preceded by an optional space or underscore and the letter "v".

(?=\b|_)[^])}]*? - This matches zero or more characters that are not ")", "]", or "}". The (?=\b|_) lookahead ensures that the match is followed by a word boundary or underscore.

(?:[[({][^])}]+[])}][ _.-])? - This matches an optional string surrounded by square brackets, parentheses, or curly braces, but does not include the enclosing characters. The string can contain any character except for ")", "]", or "}". The match is optionally followed by spaces, underscores, dots, or hyphens.

(?:[[({][\da-f]{8}[])}]) - This matches an optional string surrounded by square brackets, parentheses, or curly braces, and consists of 8 hexadecimal characters.

This is crazy!
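For anyone who wants to sanity-check the breakdown above, here is a minimal Python sketch that runs the quoted regex against a made-up episode filename (the filename is purely illustrative, not from Emby's test data):

```python
import re

# The regex quoted above, split across lines for readability.
EPISODE_RE = re.compile(
    r"(?i)[-._ ]+S(?:eason ?)?(\d{1,3})"
    r"(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+"
    r"(?=\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?"
    r"(?:[[({][\da-f]{8}[])}])"
)

# Group 1 captures the season, group 2 the (last) episode number;
# the trailing bracketed group is the 8-digit hex checksum common
# in fansub release names.
m = EPISODE_RE.search("Show Name - S01E02 [abcd1234].mkv")
if m:
    season, episode = m.group(1), m.group(2)
    print(season, episode)  # → 01 02
```

Note that the pattern only matches when the 8-hex-digit bracketed group is present, which matches ChatGPT's reading of it as a release-name pattern.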


Cheesegeezer

Me and regex = 🤯🤬🤬

although there are some good sites that help you and can test your expression, and these are priceless to me if I need to use it.
 

but AI coding is getting pretty serious stuff now. Is it good? I don't know. I had Copilot installed before they made it paid, and I got more annoyed with it as it was becoming less useful to me and just regurgitating old code that wasn't even relevant. So more often than not I had it disabled, and based on my experience I chose not to pay for it or continue using it.
 

 


4 minutes ago, Cheesegeezer said:

Me and regex = 🤯🤬🤬

LOL - I hate regex so much that my brain refuses to memorize it. Each time I need it, I need to learn it from scratch and one day later it's gone again. 🙂 

24 minutes ago, Cheesegeezer said:

but AI coding is getting pretty serious stuff now. Is it good? I don't know. I had Copilot installed before they made it paid, and I got more annoyed with it as it was becoming less useful to me and just regurgitating old code that wasn't even relevant. So more often than not I had it disabled, and based on my experience I chose not to pay for it or continue using it.

I haven't tried Copilot but I haven't read a lot of good things about it. I have MS IntelliCode enabled and sometimes it's nice as it avoids typing.

But this and ChatGPT are two different things. The former is about guessing your next line (IntelliCode) or lines (Copilot) from what you have written so far. The problem here is that the AI doesn't know what you want to do; it can just suggest what would be the most likely continuation. It suggests that continuation and you can take it or not.

With ChatGPT, you can have a conversation. It's more like asking a friend, and even when it generates sample code for illustration, it's not exactly something to copy/paste. But the main difference is that you can tell it what you want rather than letting the AI guess from partial code you have written.


15 hours ago, arrbee99 said:

Yeah, there will always be bad actors. Creating nerve agents goes against the Geneva Convention, but it doesn't mean that it doesn't happen. Unfortunately.

However, it is important to also note that the scientists in the article were originally trying to use ML for good, to help combat bacterial infections in humans.

Obviously, if you flip the expectations around, you can do bad things.

It's like owning a deadly weapon. Or having a zero-day malware attack.

You either do nothing with it, or you do something with it.

Hopefully, the "something" you do is honest and with integrity. Not diabolical.

🤷


10 hours ago, kaj said:

I just went to the OpenAI ChatGPT website, signed up, and ran some queries from there. It's certainly a different way to make queries and takes a little getting used to. Rather than asking questions and being presented with a lot of links, you now get an easy-to-follow answer, without having to delve into links and websites.

I wonder how long those websites will last if the new search paradigm is to harvest their data and hide their existence?


I see a whole new line of SEO coming in order to work with these new AI engines.  Someone is quickly going to figure out how to game this system.


1 hour ago, softworkz said:

LOL - I hate regex so much that my brain refuses to memorize it. Each time I need it, I need to learn it from scratch and one day later it's gone again. 🙂 

I haven't tried Copilot but I haven't read a lot of good things about it. I have MS IntelliCode enabled and sometimes it's nice as it avoids typing.

But this and ChatGPT are two different things. The former is about guessing your next line (IntelliCode) or lines (Copilot) from what you have written so far. The problem here is that the AI doesn't know what you want to do; it can just suggest what would be the most likely continuation. It suggests that continuation and you can take it or not.

With ChatGPT, you can have a conversation. It's more like asking a friend, and even when it generates sample code for illustration, it's not exactly something to copy/paste. But the main difference is that you can tell it what you want rather than letting the AI guess from partial code you have written.

There were many times Copilot just got in my way 😬 I was able to use it quite a bit before it was a paid product (during their testing). A lot of the time it was fine, but after a while its recommendations were inhibiting the production of code 😬. I turned it off. It would also cause issues with IntelliSense as well. Too many cooks in the kitchen. 😆

But I do like how ChatGPT handles code responses, although I have caught it making mistakes quite a few times. Regular expressions seem to trip it up for some reason.

[screenshot attached]


35 minutes ago, ebr said:

I see a whole new line of SEO coming in order to work with these new AI engines.  Someone is quickly going to figure out how to game this system.

But why bother if the search engine is just going to steal your content and intercept your potential traffic?

Is the web just going to become a bunch of APIs for a handful of ad-backed search interfaces that underpay their content providers?


1 minute ago, roaku said:

But why bother if the search engine is just going to steal your content and intercept your potential traffic?

Is the web just going to become a bunch of APIs for a handful of ad-backed search interfaces that underpay their content providers?

If you think about the current way Google handles results, it's kind of that way right now. Or maybe that's what you meant. 

You pay, you get first spot. 

I just hope that the AI isn't compromised with SEO, and instead always gives the best possible answer, not what companies have paid for it to say.

To be honest, there are other professions that take money for services rendered... Let's hope the AI doesn't become a wh##e 🤨

Whoa chef! Simmer down buddy! This is a family channel. 


36 minutes ago, chef said:

If you think about the current way Google handles results, it's kind of that way right now. Or maybe that's what you meant. 

 

The entire concept of these machine learning front ends is answering questions. They are destinations that don't cite the sources they're synthesizing from.

The premise of search engines has traditionally been to function as a middleman that leads users to websites that then monetize that traffic.

How does it benefit a third party web site to have their content churned into an elaborate web scraper in exchange for nothing?

Edited by roaku

Epic fails are not unusual, too:

 

softworkz: Which country has a flag with white background and red diagonal lines?

ChatGPT: The flag of Kyrgyzstan has a white background and red diagonal lines.

softworkz: No, Kyrgyzstan has a red flag with a yellow sun

ChatGPT: I apologize, you are correct. The country with a flag featuring a white background and red diagonal lines is Slovenia.

softworkz: Slovenia has a flag with blue red and white

ChatGPT: Yes, that's correct. The flag of Slovenia features three horizontal bands of equal height, with the top and bottom bands being white, and the middle band being blue with a coat of arms centered in the blue band.

softworkz: But which country has a flag with white background and red diagonal lines?

ChatGPT: The country with a flag featuring a white background with red diagonal lines is the flag of the Principality of Liechtenstein.

softworkz: No, that flag is blue and red

ChatGPT: The country with a flag that has a white background and red diagonal lines is the flag of Saltire, or Saint Andrew's Cross, which is the national flag of Scotland.

softworkz: No, that's blue and white

ChatGPT: The country you are referring to is probably France, which has a flag with a white background and two red diagonal lines forming a "X".

(totally wrong...)


Do you expect your responses to allow paid placement in the future?

ChatGPT:

As an AI model developed and maintained by OpenAI, I am not involved in any commercial transactions or financial arrangements. OpenAI's policies and business practices may change in the future, but I do not have information about any plans for paid placements in my responses. My main objective is to provide helpful and accurate information to users.

 

If they were to allow paid placement in your responses do you believe these would follow SEO practices currently in place?

ChatGPT:

As an AI language model, I don't have personal opinions or beliefs, but I can tell you that the use of search engine optimization (SEO) practices would depend on the goals and policies of OpenAI and any third parties involved in the placement of paid content in my responses.

SEO is a set of techniques used to optimize a website or web page to rank higher and earn more organic traffic from search engines. If paid placements were to be allowed in my responses, it is possible that some form of SEO optimization would be employed to ensure that the content being promoted is discoverable and accessible to users. However, this would ultimately depend on the specific terms and conditions set by OpenAI and any relevant regulations or guidelines.

 

Will you eventually become the same as a search engine including links back to quoted content? Or do you just steal content with no attribution?

ChatGPT:

As an AI language model, I do not "steal" content. Instead, I generate responses based on the information I have been trained on, which includes a large corpus of text from the internet. When I provide information in response to a user's question, I aim to accurately and effectively summarize relevant information and provide context where necessary. I do not intend to present information as my own original content.

Regarding links, I can provide links to relevant information in my responses if it is deemed appropriate and if it helps users better understand the information being discussed. However, the decision to include links and the specific form and function of those links would be determined by OpenAI and would depend on their policies and business practices. It is not my goal to be a search engine or compete with search engines, but rather to assist users by providing helpful and accurate information in response to their questions.

 

You are aware without generating responses with attribution your responses come off as original content?

ChatGPT:

Yes, I am aware that my responses can appear as original content if they are not properly attributed. It is important to note that I am an AI language model developed and maintained by OpenAI, and my responses are generated based on the information I have been trained on, which includes a vast corpus of text from the internet.

While I aim to provide accurate and helpful information in my responses, it is important for users to understand that my responses are not original content created by me, but rather generated based on the information I have been trained on. To avoid any confusion or misunderstandings, it is always recommended to attribute information from my responses to OpenAI or to specify that the information is generated by an AI language model.

 

Do you plan to attribute sources of information that you quote in your responses in the future?

ChatGPT:

As an AI language model, I do not make plans or decisions regarding my functioning or the presentation of information in my responses. That being said, providing proper attribution for information is an important aspect of using and presenting information accurately and ethically.

OpenAI may choose to implement ways of attributing sources of information in my responses in the future, but this would depend on their policies and business practices. If they do implement such measures, it would be with the goal of providing more transparent and accurate information to users and avoiding confusion or misunderstandings about the origin of the information presented in my responses.

 

 

Whenever it has to defend itself, it uses the ol' "As an AI language model" defense. But it does mention that ethically it should do something. It just hasn't been trained to yet. Then it points fingers at its creators. I do like this thing. ;)

Edited by speechles

TeamB
5 hours ago, softworkz said:

Damn, how did it come to that conclusion?

https://www.google.com/search?client=firefox-b-d&q=(%3Fi)[-._+]%2BS(%3F%3Aeason+%3F)%3F(\d{1%2C3})(%3F%3A[+_.-]*(%3F%3Aep%3F[+.]%3F)%3F(\d{1%2C3})(%3F%3A[_+]%3Fv\d%2B)%3F)%2B(%3F%3D\b|_)[^])}]*%3F(%3F%3A[[({][^])}]%2B[])}][+_.-]*)*%3F(%3F%3A[[({][\da-f]{8}[])}]

mostly the same way you would: it identifies the target of your question, applies any extra criteria like dates etc., searches its internal documents, finds a bunch, and extracts "facts" to build a new human-readable response.

i.e. it googles the question and looks at the matches 🙂


2 hours ago, TeamB said:

i.e. it googles the question and looks at the matches 🙂

Transformer models work totally differently, but the valid point is that the information is available in the form of similar regular expressions (e.g. from Kodi), while I had thought this would be a very specific Emby-internal thing.


TeamB
7 minutes ago, softworkz said:

Transformer models work totally differently

Yes, the transformer here is just taking the extracted data and building a document that conforms to its internal idea of what "human readable" is.

That is the trick here, taking some data from multiple sources, aggregating it and making it look like it was written by a human.

This still feels like smoke and mirrors to me. Once you get access to the version that shows you the reference material, it's like the curtain gets pulled back a little: you get to see the references used, and it becomes clear why it is stating as FACT what it is presenting to you, and why it is wrong a lot of the time.

This system is only as good as the data it is trained on, so for very new, topical items like current news stories it is useless.


41 minutes ago, TeamB said:

Yes, the transformer here is just taking the extracted data and building a document that conforms to its internal idea of what "human readable" is.

That is the trick here, taking some data from multiple sources, aggregating it and making it look like it was written by a human.

This still feels like smoke and mirrors to me. Once you get access to the version that shows you the reference material, it's like the curtain gets pulled back a little: you get to see the references used, and it becomes clear why it is stating as FACT what it is presenting to you, and why it is wrong a lot of the time.

This system is only as good as the data it is trained on, so for very new, topical items like current news stories it is useless.

I like the idea of it summarizing websites, though. I think that feature alone has me sold. If I try to get a decent answer out of GPT and I'm still not satisfied, having the ability to skim articles about the subject and get a summary is very appealing.

That's definitely a cool trick. One that I would use a lot.

I wish I was chosen for the preview, but I'm still waiting.


38 minutes ago, TeamB said:

Yes, the transformer here is just taking the extracted data and building a document that conforms to its internal idea of what "human" readable is.

Not quite, but anyway, that's nothing to discuss here in detail.
The most fascinating part is that it is not yet fully understood how exactly it works, how the model comes to its conclusions, and why it works so well.
 


13 minutes ago, softworkz said:

Not quite, but anyway, that's nothing to discuss here in detail.
The most fascinating part is that it is not yet fully understood how exactly it works, how the model comes to its conclusions, and why it works so well.
 

I think that having all the humans supervise the training made the difference. 

That and maybe, just maybe... it was able to "learn" logic from the CODEX dataset... Maybe..  🤔 it takes a logical mind to be able to program code. 


TeamB
19 minutes ago, softworkz said:

Not quite, but anyway, that's nothing to discuss here in detail.

not here, but a source would be good.

19 minutes ago, softworkz said:

The most fascinating part is that it is not yet fully understood how exactly it works, how the model comes to its conclusions, and why it works so well.

I have worked on enough large models in my time to understand the issues with "knowing" exactly why you got a certain classification or determination, especially with many features. I know generator/transformer models are different from most of the ones I have been building, but the underlying issues are still there. You have so much data that pinpointing the reasons you get a particular response is actually hard.

I have lost count of the number of times a "user" of one of the systems I have worked on comes to the team and says, "tell me why it gave this classification". You start to explain that there are 280K features and the model was trained on 10M labeled items (supervised), and that it is next to impossible to identify exactly why that item was classified the way it was with that probability/certainty.

With ChatGPT it is compounded, as they have a learning feedback loop that they used to refine the results, giving them a reinforcement learning mechanism to help with the answers that were graded by humans.

Yes, it is not unusual to build a model and have it perform really well, like super well, no overfitting, no large FP/FN rates, and you look at it and go, well %^&* if I know, let's ship it. However, that is not to say it is "unknown" how it got the correct classifications; it is just surprising that it does so well, until of course the ecosystem changes and then it regresses over time.

Edited by TeamB

2 minutes ago, TeamB said:
21 minutes ago, softworkz said:

Not quite, but anyway, that's nothing to discuss here in detail.

not here, but a source would be good.

As soon as you shift your reading from the typical news articles and reports to somewhat more (but not necessarily deep) scientific literature, you'll come across this very quickly. It's been a while, though, so I don't have a good pointer at hand.

But how about a source directly from the source? 

softworkz: Is it correct to say that the way how transformer models for AI are working is not fully understood yet?

ChatGPT: Yes, it is correct to say that the inner workings of transformer models for AI are still not fully understood, even though they have been extensively studied and widely used in many applications. Despite their impressive performance, the complex interplay between the model components and the high-dimensional representations learned by these models makes it difficult to fully understand and interpret how they arrive at their predictions. Research into the interpretability of transformer models is ongoing and significant progress has been made, but there is still much to be learned.


47 minutes ago, TeamB said:

  well, until of course the ecosystem changes and then it regresses over time.

Do you think that we'll see an even more intuitive model with even more logical ability? Is that what you mean by the ecosystem changing?


TeamB
1 hour ago, softworkz said:

As soon as you shift your reading from the typical news articles and reports to somewhat more (but not necessarily deep) scientific literature, you'll come across this very quickly. It's been a while, though, so I don't have a good pointer at hand.

But how about a source directly from the source? 

softworkz: Is it correct to say that the way how transformer models for AI are working is not fully understood yet?

ChatGPT: Yes, it is correct to say that the inner workings of transformer models for AI are still not fully understood, even though they have been extensively studied and widely used in many applications. Despite their impressive performance, the complex interplay between the model components and the high-dimensional representations learned by these models makes it difficult to fully understand and interpret how they arrive at their predictions. Research into the interpretability of transformer models is ongoing and significant progress has been made, but there is still much to be learned.

While I don't work directly with transformer models, this is close to what I was saying above: at a certain point it becomes very difficult to deterministically know why a result was returned.

I was going to do the same thing and see what ChatGPT came back with but you got there first 🙂

The transformer is just a neural network that understands how a language is constructed. They have been around for a while now; they can understand sentence structures, questions, and how to explain things. There is a good intro:

https://towardsdatascience.com/transformers-141e32e69591
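For readers curious what the core building block of a transformer actually computes, here is a minimal NumPy sketch of scaled dot-product attention: every position attends to every other position, weighted by query/key similarity. The shapes and random values are arbitrary, just for illustration:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each row of Q attends over the rows of K; the resulting
    weights mix the rows of V into a new representation."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query/key similarity
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted mix of values

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))   # 4 tokens, 8-dim queries
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)   # shape (4, 8)
```

A real transformer stacks many of these layers with learned projections for Q, K, and V, which is where the "understanding" of sentence structure emerges.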

But even if a transformer model knows how to break down your question perfectly and is able to build a well-worded and understandable aggregated document from the results it was trained on, it still might get it all wrong if the data it was trained on was incomplete or old/out of date. I know this is a given, but it is frustrating that the general media is not picking up on this more.

 

Edited by TeamB
