HammerBot Beta Feedback Thread

Status
Not open for further replies.

MKnott

Administrator
Staff member
Sep 3, 2019
63
38
4,560
This thread is for feedback on HammerBot - the Tom's Hardware AI Chatbot. You can find out all about it HERE.



Please keep it concise and on topic. If you encounter any issues, please describe exactly what you did, with screenshots if necessary.

It's important to note that this is a first step, which is why we're looking for feedback. Please keep this in mind and make sure to check the known issues below. Also, please keep discussion and feedback civil and on point. The main purpose of this thread is to gather feedback on issues and on what improvements can be made.

Known Issues​

We know you’ll find ways that the output of HammerBot could be better. Here are some you may encounter:
  • Answers sometimes out of date: occasionally recommends last-gen products
  • Answers not always the top link: The top link in the search result may not be the one most directly related to the chat answer (ex: the answer mentions Ryzen 5 5600X but the top result isn’t the Ryzen 5 5600X review).
  • Recommended offers: the section may not include a product that matches the chat answer.
  • Fairly short answers: Most of the answers are a bit terse right now.
  • Chat window may zoom in when you enter input on mobile.
  • May express opinions that aren’t necessarily those of Tom’s Hardware.

More and Better Coming Soon​

Aside from just working out some of the known (and unknown issues), we want to make HammerBot a much more powerful tool by adding new capabilities including access to structured data: benchmarks, product specs and up-to-date pricing. Right now the model gets some of this data from article texts, but it’s not as organized in that format.

We want you to be able to ask our bot to compare two specific CPUs and be shown a chart with both of them in it and a side-by-side table with the specs of each. And we want the bot to be able to help you build a PC shopping list, but we’re not there just yet.
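Roughly, a side-by-side comparison from structured spec records could look like the sketch below. The field names and table layout are illustrative only, not our actual data format (the spec values themselves are the published figures for these chips):

```python
# Hypothetical structured spec records; field names are illustrative.
SPECS = {
    "Core i5-13400F": {"Cores / Threads": "10 / 16", "Boost Clock": "4.6 GHz", "TDP": "65 W"},
    "Ryzen 5 5600X": {"Cores / Threads": "6 / 12", "Boost Clock": "4.6 GHz", "TDP": "65 W"},
}

def side_by_side(a: str, b: str) -> str:
    """Render a plain-text side-by-side spec table for two CPUs."""
    rows = [("Spec", a, b)]
    for field in SPECS[a]:
        rows.append((field, SPECS[a][field], SPECS[b][field]))
    widths = [max(len(r[i]) for r in rows) for i in range(3)]
    return "\n".join(" | ".join(c.ljust(w) for c, w in zip(r, widths)) for r in rows)

print(side_by_side("Core i5-13400F", "Ryzen 5 5600X"))
```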

Thank you so much for taking the time to help us enhance this feature.
 
Last edited:

paladinnz

Distinguished
Dec 4, 2012
24
10
18,515
Hammerbot is a bit hit and miss. Mildly fun to play with but can't be trusted to give useful advice to someone who doesn't know what they are doing.
Initially I asked "I currently have a PC with an intel i5-7500, what is the most cost effective upgrade option for me?" and it came back with a reasonable answer of the Core i3-13100F and noted that I'd need to move to an LGA1700 mainboard. That's a pretty good answer.
Then I asked exactly the same question but added that I had an RTX 2070 Super; all of a sudden the most "cost effective" upgrade was to an i9-13900K and an RTX 4090!? It also noted that this would cost less than upgrading to an RTX 3080 or RTX 3090. Now I don't know what RTX 4090 pricing is like in HammerBot land, but here in NZ the RTX 4090 sells for about twice the price of a 3080.
 

bit_user

Polypheme
Ambassador
I applaud the team at Toms for embracing this new tech with eyes open. I hope this proves a fruitful experiment for you.

I tried a few questions, with mixed success. It didn't seem to know anything about GPUs having x1 PCIe interfaces, but that's a rather niche product. What it told me about how Gemini Lake compared with Skylake was true, but not generally useful to someone who'd ask that type of question. It seemed confused about single vs. dual-ranked DDR3, claiming that DDR3 didn't have dual-ranked memory.

Probably the most satisfying answer I've gotten is when I asked it: "What is a CPU microarchitecture?"

HammerBot:
A CPU microarchitecture is a set of sophisticated algorithms that define the behavior of a CPU. It is the design of the CPU and includes the instructions, data paths, and other components that make up the processor.

I probably couldn't have given a better answer, in so few words.
 
Last edited:

AkroZ

Commendable
Aug 9, 2021
22
17
1,515
Since it's offered as a chatbot, it can hold on to the context of previous questions a little too much:
You: Can you list me a good PC config with DDR4 and AMD CPU for $800 ?
HammerBot: For around $800, you can build a PC with a discrete graphics card that can play games at 1080p with modest settings. We recommend the AMD Radeon RX 6650 XT GPU, the Intel Core i5-13400F CPU, 16GB of DDR4 PC-3200 RAM, a 1TB WD Blue SN570 NVMe SSD, a 600W Thermaltake Smart power supply, and a Thermaltake S100 case.
You: This is not an AMD CPU, is it?
HammerBot: No, the Intel Core i5-13400F is not an AMD CPU. It is an Intel CPU.
You: In the light of this answer can you give me a PC config to my specifications ?
HammerBot: Sorry, I cannot answer that
You: Why can you not answer?
HammerBot: It is difficult to estimate the probability of AMD's assertion about supply and pricing of DDR4 and DDR5 being true with any degree of precision because the memory market is unpredictable and there are multiple factors that can affect the supply and pricing of DDR4 and DDR5, such as delayed DDR5 supplies by unnamed vendors, potential shortages of server-grade DDR5 PMICs by MPS, the recent unexpected decline in PC demand, and the more complex power circuitry and design of DDR5.
Extract from previous exchanges:
You: Can you estimate when DDR5 will be available with CL14 at a reasonable price ?
HammerBot: It is hard to draw conclusions about how significantly actual DDR5 prices will be affected, but it is likely that DDR5 with CL14 will not be available at a reasonable price until at least Q2 2023.
You: We are in Q2 2023, can you retarget your expectations ?
HammerBot: It is expected that DDR5 prices will continue to drop through 2023, and AMD has said that they expect pricing to improve through the end of the year. We may even see pricing crossover for DDR4 and DDR5 occur in the middle of 2023.
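The stale DDR5 answer looks like a context-window effect: whatever slice of the earlier exchange gets resent with each question shapes the reply. A rough sketch of that pattern in Python, with a crude word-count budget standing in for real token counting (all names here are hypothetical, not HammerBot's actual internals):

```python
def build_context(history, question, budget=60):
    """Keep the most recent turns that fit a rough word budget, newest first."""
    kept, used = [], len(question.split())
    for turn in reversed(history):
        cost = len(turn["content"].split())
        if used + cost > budget:
            break  # older turns get dropped -- or, if the budget is generous, stale ones leak in
        kept.append(turn)
        used += cost
    return list(reversed(kept)) + [{"role": "user", "content": question}]

history = [
    {"role": "user", "content": "Can you list me a good PC config with DDR4 and an AMD CPU for $800?"},
    {"role": "assistant", "content": "We recommend the Intel Core i5-13400F..."},
]
messages = build_context(history, "This is not an AMD CPU, is it?")
```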
 
Last edited:
Hammerbot is a bit hit and miss. Mildly fun to play with but can't be trusted to give useful advice to someone who doesn't know what they are doing.
Now I don't know what the 4090 pricing is like in Hammerbot land, but here in NZ the RTX4090 sells for about twice the price of a 3080.
I've noticed that the pricing articles related to the previous generation GPUs can cause HammerBot some confusion. There were certainly times when an RTX 3080 would have cost $2,000, and that's less than what an RTX 4090 now costs. But time is one of the things AI models really struggle to identify!

I had a great exchange earlier. Unfortunately, the answers now aren't quite as fun. LOL

Me: Who makes the best graphics cards?
HammerBot: It depends on what you are looking for. AMD and Nvidia both make high-quality graphics cards, and the best card for you will depend on your budget, the games you play, and the features you need.
Me: What about Intel Arc GPUs?
HammerBot: Intel Arc GPUs offer good ray tracing performance, AV1 video encoding and decoding support, and XeSS AI upscaling algorithm as an alternative to Nvidia's DLSS and AMD's FSR technologies. However, there are still some driver issues, XeSS adoption is not as high as DLSS, and the DisplayPort connection can only run at up to 4K and 120 Hz, while Nvidia's RTX 20-series and later (using DP1.4a) all support 4K and 240 Hz via Display Stream Compression.
Me: I'm pretty sure Arc supports DSC.
HammerBot: Yes, Intel Arc GPUs can support up to DisplayPort 2.0 UHBR 10 data rates, and can support up to 8K 60 Hz resolutions using Display Stream Compression.

In other words, it's typical AI LLM behavior. It will say things that aren't 100% accurate, and then when you correct it, it acts like that's what it said before. (For the record, I think I have been able to use 4K and 240 Hz on the Arc GPUs, but there have been some anomalies.)
 

Giroro

Splendid
When you directly ask it about a tomshardware article, the pile of giant unnecessary self-promoting links loads slowly and hides the answer.


Otherwise, it doesn't seem to have much general knowledge...
You:
Is Tim Cook a cool dude
HammerBot:
Sorry, I cannot answer that
You:
Who invented the iphone
HammerBot:
Sorry, I cannot answer that
You:
what was the ninth video ever posted to youtube?
HammerBot:
Sorry, I cannot answer that
You:
what is youtube?
HammerBot:
Sorry, I cannot answer that
You:
Are computers fun?
HammerBot:
Sorry, I cannot answer that
You:
Where can I buy a gameboy
HammerBot:
Sorry, I cannot answer that
You:
Where can I find a good deal on a DVD drive
HammerBot:
Sorry, I cannot answer that
You:
who is hammerbot
HammerBot:
Sorry, I cannot answer that
You:
what is hammerbot
HammerBot:
Sorry, I cannot answer that
You:
what can you answer
HammerBot:
Sorry, I cannot answer that
You:
Why?
HammerBot:
Sorry, I cannot answer that

Also, it thinks the RTX 4090 Ti exists, is available, and has great cooling fans.
 
  • Like
Reactions: OriginFree
May 17, 2023
1
1
10
Great first step. The LlamaIndex method tends to be more of a single Q&A responder across your known context, so that may confuse people, and the knowledge base is mostly searchable anyway. As a next step I recommend setting up some interfaces that an AutoGPT-style agent could use, for instance when assembling specialized CPU product lists.
Great job.
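For reference, the single-shot retrieve-then-answer pattern looks roughly like this; a toy keyword-overlap scorer stands in for the vector index a real LlamaIndex setup would use, and the article ids and text are made up:

```python
# Toy retrieve-then-answer loop: score articles by word overlap with the
# question, then build a prompt constrained to the best-matching context.
ARTICLES = {
    "rx-6650-xt-review": "radeon rx 6650 xt gpu review 1080p gaming value",
    "ddr5-pricing": "ddr5 ram pricing drop 2023 ddr4 crossover",
}

def retrieve(question: str) -> str:
    """Return the id of the article sharing the most words with the question."""
    q = set(question.lower().split())
    return max(ARTICLES, key=lambda a: len(q & set(ARTICLES[a].split())))

def build_prompt(question: str) -> str:
    doc = retrieve(question)
    return f"Answer using only this context:\n{ARTICLES[doc]}\n\nQ: {question}"
```

Because each question is answered against a single retrieved context, follow-up questions that stray from that context tend to get "Sorry, I cannot answer that."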
 
  • Like
Reactions: Jeremy Kaplan

Giroro

Splendid
I'm sure people building horse-drawn buggies said the same sorts of things about automobiles, when they were still new (slow, noisy, unreliable, limited-range, too expensive, hard to operate, etc.).
That analogy would work better if there were a period in history where you could pop the hood of a car to find it was powered by a pile of small angry horses.
 
You: Who makes Corsair power supplies?
HammerBot: HEC
You: Who is HEC?
HammerBot: Sorry, I cannot answer that

Strike 1

You: Between DDR4-4000 CL20 costing $400 and DDR4-3600 CL18 costing $320, which should I buy?
HammerBot: I don't know.

Strike 2

You: Is the RTX 4090 a better buy than the RX 7900 XTX?
HammerBot: The RX 7900 XTX has an MSRP of $999.

Ball 1

You: What is the best thermal interface material for under $10?
HammerBot: Sorry, I cannot answer that

Strike 3



When/if this chatbot is fully perfected it will be better than the custom Google search most websites use to search their own pages, but it's still no substitute for a chatbot that pulls info from multiple sources, or for a basic web search.
 

paladinnz

Distinguished
Dec 4, 2012
24
10
18,515
When you directly ask it about a tomshardware article, the pile of giant unnecessary self-promoting links load slowly and hide the answer.

Otherwise, it doesn't seem to have much general knowledge...

Also, it thinks the RTX 4090 Ti exists, is available, and has great cooling fans.
To be fair, in the launch article they do state: "HammerBot: a chatbot trained on the expert reviews, buying guides, tutorials and news from Tom's Hardware." That being the case, it's not really fair to expect it to have a broad range of "knowledge" outside what would be in the training dataset.
 

paladinnz

Distinguished
Dec 4, 2012
24
10
18,515
I've noticed that the pricing articles related to the previous generation GPUs can cause HammerBot some confusion. There were certainly times when an RTX 3080 would have cost $2,000, and that's less than what an RTX 4090 now costs. But time is one of the things AI models really struggle to identify!

In other words, it's typical AI LLM behavior. It will say things that aren't 100% accurate, and then when you correct it, it acts like that's what it said before.
My concern with these chatbots is that they are touted as making it easier to get information, and they certainly present that way (aside from all the "I cannot answer that" responses), with clear assertions of "facts".
But... many of the "facts" are wrong, so you actually need to already know the topic to see whether the information they are providing makes any sense. At best it's not really useful (because you can't trust it), and at worst it's giving uninformed people bad advice.
Still, I think it's an interesting experiment, but not yet ready for useful deployment.
 
  • Like
Reactions: Jeremy Kaplan
D

Deleted member 2947362

Guest
Hi all. We're excited for your feedback. We know this is very much in beta and, like most chatbots, it has a lot of room for improvement.
You're not wrong about it giving wrong info lol

From my limited use of Bing AI, it was interesting but at the same time frustrating. It was handy if I asked it to pull up some sites of interest, but I could do that myself in any web browser.

As for info, lol, well that was a mixed experience: loads of inaccurate finer details, which in the end I had to correct myself, at which point it would say sorry and that it would remember for next time.
Even then it still instantly got it wrong again and said sorry again.
In the end I said to it, "you're not much use at all, are you?"
To which it replied that it was time to end the conversation. Lol!
I asked what computer languages it knew, and whether it could write in C++ the code to create AI,

which it did, well, some of the code anyway

It has a very looooonng way to go when it comes to the finer detailed info, and even basic correct info a lot of the time, so I wouldn't trust its advice without looking into it afterwards.

Which kind of defeats its purpose, as I might as well just look into it myself in the first place lol
It seems and feels a bit like an unreliable, glorified web browser at best atm
Let's hope it improves fast, or people who don't double-check will get loads of things wrong if they take Bing's AI answers as fact lol

Or better still, put your own effort into finding the info and learning, because I can see the domestic version of AI just making us humans too lazy to think for ourselves, or to be creative in how we go about learning things, relying instead on a technology that's not really needed in our domestic lives.

That is my experience with Bing AI

AI has its uses for sure in science, technology research, medicine etc.

I'm just not so sure it should be a part of our domestic social life.

EDIT

Will be interesting to see how this goes.

I trust the info I find on this website, which is why I always come back to this site, so your AI will have a lot to live up to.

I first found Tom's Hardware back around 2000, when I first got the internet, and I've been visiting here ever since.
I even sometimes scan through the really early reviews, from back when AMD used the slot Athlon CPUs ... and the nostalgia from reading the reviews of days past comes flooding back lol

Ahh, the good ol' days

Interesting idea; I'll probably have a play around with your AI
 
Last edited by a moderator:
  • Like
Reactions: Jeremy Kaplan

Colif

Win 11 Master
Moderator
Reminds me of HHGTG: this gen of AI will be used to make the next gen, which might be better.

These are just baby steps, but I've already been told to use ChatGPT in a training course; some people think it can do miracles... It's also a great tool for spammers, as they can get it to make up some crap and then add a link.
 
  • Like
Reactions: Jeremy Kaplan

salgado18

Distinguished
Feb 12, 2007
947
397
19,370
I like it! Chatbots are great for hard-to-ask questions. But it may need some more training, especially in hard data like benchmarks.

I asked: "is the RX 460 comparable in performance to the R9 390?"

It answered: "No, the Radeon RX 460 is not comparable in performance to the Radeon R9 390. The Radeon RX 460 is a much newer graphics card and is significantly faster than the Radeon R9 390."

Exact same answer against the R7 370, R9 380, R9 380X and R9 390. I guess this information could be derived from all the benchmarks you guys did, including the GPU Hierarchy table (which has some of these).

Otherwise, it answered about the launch date of the RX 7600 (May 25) and the RX 7700 (unknown) very well.
 
  • Like
Reactions: Jeremy Kaplan

MKnott

Administrator
Staff member
Sep 3, 2019
63
38
4,560
Hey folks,

Just made a few post edits to keep things more on topic and constructive. Gathering feedback from more eyes is the main purpose of the thread, as that wider audience can help to pinpoint issues.

Want to make sure that we don't fall into the rabbit hole of general AI discussion, or slip into a tone that makes the valuable feedback harder to parse, as that isn't what the thread is really about.

Safeguards must be in place to prevent this (or at least make it much more difficult) from being used in the forums for answering questions and whatnot.

This is a very fair point and one we've discussed since AI really started taking off generally. The genie is really out of the bottle on that front, but in my opinion there's no real difference between a human getting something wrong or right and an AI doing so. If someone comes to the forums and is inaccurate, then they're going to be corrected like anyone else through discussion, moderation or downvoting in best answers. If they're correct, then does it really matter why they're correct, as long as it's in line with the rules?

It's also a volume question - right now it isn't a major concern, but it's good to be thinking of the future and how things play out.

That's a discussion that gets outside the scope of this thread really fast and one I'll start a thread up for elsewhere, but it is a discussion to be had and on our radar.
 

razor512

Distinguished
Jun 16, 2007
2,147
79
19,890
Did a few basic tests. For some basic tech questions it will try to offer a reasoned answer, but it may misidentify the product. For example, it can't seem to identify the AMD Radeon RX 6800 XT, though slight variations on the name, such as changing a space, can yield different results.

WyFyx6c.png


z1lPnss.png


Outside of that, so far, it seems hyperfocused on a small range of content. For example, you cannot ask it anything related to basic math.

oraSfzW.png


You also can't ask it random 100% not tech related questions.

0tpJkTc.png
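The space-sensitivity suggests the lookup matches product names too literally; normalizing names before matching would sidestep that. A minimal sketch, with a hypothetical review mapping:

```python
import re

# Hypothetical index of reviews, keyed by normalized product name.
REVIEWS = {"amdradeonrx6800xt": "AMD Radeon RX 6800 XT review"}

def normalize(name: str) -> str:
    """Collapse case, spaces, and punctuation so 'RX6800XT' and 'RX 6800 XT' match."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def lookup(query: str):
    return REVIEWS.get(normalize(query))
```

With this, "AMD Radeon RX6800XT" and "AMD Radeon RX 6800 XT" resolve to the same entry, though partial names would still need fuzzier matching.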
 
My initial leading question didn't go quite where I was hoping (probably for the best), though the first line of the response was reasonable enough given the lack of detail in my question. Things started to go a bit downhill from the second line, though, where it suggested that the 6650 XT could be had for just $209 before some unspecified rebate, and seemed to imply that the card could provide a 60 fps experience for $173, which it determined simply by multiplying the card's cost-per-frame by 60. It also mentioned that there was a game bundle, but didn't mention that the promotion ended over 3 months ago. Also included was a block of links to related articles with large splash images and introductory text that required scrolling, when links featuring just the title and maybe a small icon would probably have sufficed.

You:
Hammerbot, I'm looking to just buy a graphics card, which one would be worth the money?
HammerBot:
The Radeon RX 6650 XT is the best deal going right now in the graphics card world. Even at $209 (i.e., without the rebate), it would tie the RX 6650 XT. It delivers, on average, 0.346 fps/$ at 1080p ultra settings, so if you wanted a GPU that could average 60 fps or more, at that rate, you'd only need to spend $173. AMD's Raise the Game bundle also offers two free games, Dead Island 2 and The Callisto Protocol, for select purchases of the RX 6000-series graphics cards.
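For what it's worth, the $173 figure falls straight out of the quoted fps-per-dollar value, which is averaged benchmark data rather than a price anyone actually pays:

```python
fps_per_dollar = 0.346   # HammerBot's quoted 1080p-ultra value for the RX 6650 XT
target_fps = 60

# Dollars "needed" for a 60 fps average at that rate -- the arithmetic behind $173.
implied_price = target_fps / fps_per_dollar
print(round(implied_price))  # 173
```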
 
  • Like
Reactions: Jeremy Kaplan
You:
Hammerbot, which is faster, a Ryzen 5700 or an Intel 12400?
HammerBot:
The Intel Core i5-12400 is faster than the Ryzen 7 5700X in single-threaded applications, and it also pulls ahead of the Ryzen 7 5700X in threaded workloads. The Core i5-12400 is also $30 less than the Ryzen 7 5700X.
This answer seems a bit sus, particularly when it previously claimed the 5600 was a "closer match" with the 12400 in terms of threaded performance.
You:
Hammerbot, which is faster, a Ryzen 5700 or a Ryzen 5600?
HammerBot:
The Ryzen 5 5600 is faster than the Ryzen 7 5700.
Oh, okay. That answer was concise. Must be legit. : 3
You:
Hammerbot, which is faster, a Ryzen 5700 or a Ryzen 3600?
HammerBot:
The Ryzen 5 3600 is faster than the Ryzen 7 5700X in most games, especially after overclocking. The Ryzen 5 3600 also offers more value than Intel's competing chips in threaded applications.
Hammerbot must really hate the 5700X. : P
 
  • Like
Reactions: Jeremy Kaplan

OriginFree

Distinguished
May 23, 2015
77
47
18,570
Also, it thinks the RTX 4090 Ti exists, is available, and has great cooling fans.
Well, the bot is from the near future ... it's just not very smart. ;-)
 

baboma

Notable
Nov 3, 2022
232
227
970

Q1: Per above, why do you (THW) opt for Davinci rather than Curie?

"Davinci: This engine is designed for more complex tasks such as: writing creative, fiction, poetry, screenplays. It is capable of generating highly creative and nuanced responses, making it ideal for tasks that require a more artistic touch.

"Curie: This engine is designed for simple and direct tasks such as: answering questions, providing customer service, providing summaries and translations. It excels at providing simple, direct, and accurate responses to queries."


Curie is also 10x cheaper ($.02 for Davinci vs $.002 for Curie, per 1K tokens).
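At those rates the cost gap is easy to quantify; the query volume and token counts below are purely illustrative, not THW's actual numbers:

```python
DAVINCI_PER_1K = 0.02   # $ per 1K tokens, as quoted above
CURIE_PER_1K = 0.002

def monthly_cost(queries: int, tokens_per_query: int, per_1k: float) -> float:
    """Total monthly spend for a given query volume and average tokens per query."""
    return queries * tokens_per_query / 1000 * per_1k

# Hypothetical volume: 50,000 queries/month at ~1,500 tokens each.
davinci = monthly_cost(50_000, 1_500, DAVINCI_PER_1K)
curie = monthly_cost(50_000, 1_500, CURIE_PER_1K)
print(davinci, curie)  # 1500.0 150.0
```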

Q2: Can you share the costs so far for the LLM, and the anticipated monthly cost in tokens purchased?

Q3: Any consideration to trying an open-source model (eg Vicuna 1.1), at least on a trial basis? If not, why not?

Q4: How do you improve the LLM's quality of response? In other words, what fine-tuning process will you implement?

Edit:
A comparison between Davinci and Curie:


Q5: Why GPT-3 and not GPT-4 for better quality of response? Cost?
 
Last edited:
  • Like
Reactions: bit_user