Hacker News | adg29's comments

Evaluation dataset designed to test the capabilities of Retrieval-Augmented Generation (RAG) systems. Paper with details and experiments is available on arXiv: https://arxiv.org/abs/2409.12941.

Dataset Overview
- 824 challenging multi-hop questions requiring information from 2-15 Wikipedia articles
- Questions span diverse topics including history, sports, science, animals, health, etc.
- Each question is labeled with reasoning types: numerical, tabular, multiple constraints, temporal, and post-processing
- Gold answers and relevant Wikipedia articles provided for each question

Key Features
- Tests end-to-end RAG capabilities in a unified framework
- Requires integration of information from multiple sources
- Incorporates complex reasoning and temporal disambiguation
- Designed to be challenging for state-of-the-art language models

Usage
This dataset can be used to:
- Evaluate RAG system performance
- Benchmark language model factuality and reasoning
- Develop and test multi-hop retrieval strategies
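
To make the "evaluate RAG system performance" use case concrete, here is a minimal sketch of scoring a system against gold answers. It uses normalized exact match, which is a simplification I am swapping in for illustration and not necessarily the scoring used in the paper. The record fields `question` and `gold_answer`, the `answer_fn` callable, and the toy records are all hypothetical placeholders, not the dataset's actual schema or contents.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match_accuracy(records, answer_fn):
    """Fraction of records whose predicted answer equals the gold answer
    after normalization. `answer_fn` maps a question string to an answer
    string (for example, a call into a RAG pipeline)."""
    if not records:
        return 0.0
    hits = sum(
        normalize(answer_fn(r["question"])) == normalize(r["gold_answer"])
        for r in records
    )
    return hits / len(records)


# Toy, made-up records (hypothetical field names, not real dataset rows).
toy_records = [
    {"question": "q1", "gold_answer": "The Eiffel Tower"},
    {"question": "q2", "gold_answer": "1969"},
]

# Stand-in for a real RAG system: a fixed lookup table.
canned = {"q1": "eiffel tower.", "q2": "1970"}
toy_accuracy = exact_match_accuracy(toy_records, lambda q: canned[q])
```

Exact match is strict and will undercount answers that are correct but phrased differently, so for free-form answers a softer comparison may be a better fit.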


“Facebook users, randomized to deactivate their accounts for 4 weeks in exchange for $102, freed up an average of 60 minutes a day, spent more time socializing offline, became less politically polarized, and reported improved subjective well-being”

In my experience Facebook offers polarization and information overload on one side, or a barrage of memes and random nostalgia on the other. I’ve given up on trying to manage what is recommended to me there, although for a while I did purge my friends and likes carefully, to some effect.

I don’t go so far as deleting Facebook, but it’s definitely less a resource and more an entertainment/distraction vehicle.


Mapbox enables interactive storytelling with a low-code template for telling map-based stories.

https://www.mapbox.com/solutions/interactive-storytelling


Update:

Department of Homeland Security sources say the site never crashed, nor did it seem to be in any danger of doing so. https://twitter.com/nakashimae/status/1239569102831857665

According to Bloomberg, officials don’t yet know who is responsible but are assuming it’s a “hostile foreign actor.”


Intrigued by hearing that “many of us saw this coming”. Is there a frontier of folks in opposition to censorship of targeted platforms?


This headline strikes me as a shift that burdens users themselves with finding bad actors. Even the statement from, e.g., Twitter makes it seem so:

“We think it’s important for people to be aware that this exists out there and that they review the apps that they use to connect to their accounts,” said Lindsay McCallum, a Twitter spokeswoman.

But I would argue that users who fall for these nefarious services should not have to be the lookout. Instead, trust should be placed in the teams at Facebook and Twitter that vet bad actors, e.g. oneAudience.

I understand vulnerabilities abound and moderation is hard, and educating users is important. I'm just irked a bit that the accountability is shifted here.


My guess is that the most outspoken and principled activists among the Google workforce will find themselves singled out. If they navigate an exit wisely, it could be great for them personally and for the industry at large. The upside is that by managing an exit from adverse working environments, their passion, experience, and pedigree will likely fuel some great upstarts.

I really hope these people choose their battles wisely and use their force for good. Not to transform a firm, but rather to disrupt the status quo of an entire industry, without needing association with a name like Google or FB.


“So if you're sending one email that you wrote in 10 seconds to 20 people, you're not spending 10 seconds, but more like 20 minutes of resources: wouldn't it be better to work 5 minutes to find a solution yourself?”

I aim to be the person who will take 5 minutes to find a solution rather than seeking it via email or tweet. Weighing my time against others’ time and attention has been habitual throughout my career. I pride myself on being capable and considerate. But many corporate, office, or otherwise networked cultures will drown out a person’s focused effort and wins among the outspoken or otherwise visible folk.

I appreciate your work in bringing to light email as a costly tool in the process of knowledge sharing and coordination.


Curious: why do you qualify HN as semi-social?


Not OP, but personally I definitely view HN as a comment board that points to links as a way to drive discussion as opposed to a news site that allows comments. Probably not how it is normally described, but it's definitely how I use it.


Agreed. That's how I use it too.


There's no real culture here of meta discussion or throwaway comments/in jokes like there is on more socially engaged platforms.

The majority of discussion here is "on topic" - which is good but very different to Reddit or Slack communities which tend to revolve more around personalities.


Ah, well I guess semi is the wrong word here? I guess I mean HN has a group of people in general that have a shared background. I would guess that a large % of people on HN have some sort of STEM background, which doesn’t match the Reddit community. I do not see many people on /r/the_donald with a STEM background arguing against climate change.


Dark patterns are often justified as trying to gain more of a user’s/customer’s attention, as seems to be the case here. IMHO, this is Amazon shrewdly introducing limits to their charity.

Whether it be a functional constraint or a dark pattern, the user’s attention is being exchanged for agency of charity on behalf of the consumer.

A quick search on Twitter leads me to believe there is a feedback cycle Amazon leveraged with notifications about the Smile program. Consumers buy, notifications of their charity are pushed, Amazon profits. Repeat cycle. Amazon profits.


If one considers absence of desire to take a certain approach as a constraint, anything can be a legit constraint.


I think we all know the answer to that.

