The JSON API appears to return very different results from what the browser shows.
Put this URL in your browser and look at the results, then try it with API Kitchen, curl, Mechanize, etc.
You get 100 results with the browser. Retrieving it with any of the non-browser methods gets you only 1-2 results.
Is this a bug, or an intentional design to limit what web crawlers gather from Reddit? On larger subreddits it makes for incredibly inconsistent results, and the "after" parameter then becomes inaccurate for paging, resulting in a ton of duplicate results.
Yet I can't find any documentation indicating that this is intentional and not a bug. If there are limits, that's cool; I just want to know what they are so I can respect them properly in my code.
It turns out that the problem was that authenticated and unauthenticated requests get different responses. If you authenticate, the API returns the complete set of results.
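To compare the two cases, here is a minimal Python sketch that builds the same listing request with and without credentials, counts the items in a response, and drops duplicates when merging pages fetched with "after". It is only an illustration under some assumptions: the function names, the `/new` listing path, and the user agent string are made up for this example, and it assumes you already have an OAuth bearer token and that authenticated calls are sent to `oauth.reddit.com`. Nothing is sent over the network until you pass the request to `urllib.request.urlopen`.

```python
import json
import urllib.parse
import urllib.request


def build_listing_request(subreddit, token=None, limit=100, after=None,
                          user_agent="example-script/0.1"):
    """Build (but do not send) a request for a subreddit listing.

    All names here are hypothetical; only `limit` and `after` come from
    the question itself.
    """
    params = {"limit": limit}
    if after:
        params["after"] = after
    # Assumption: authenticated calls go to oauth.reddit.com with a bearer
    # token, while anonymous calls use the public .json endpoint.
    host = "https://oauth.reddit.com" if token else "https://www.reddit.com"
    path = f"/r/{subreddit}/new" + ("" if token else ".json")
    url = f"{host}{path}?{urllib.parse.urlencode(params)}"
    headers = {"User-Agent": user_agent}
    if token:
        headers["Authorization"] = f"bearer {token}"
    return urllib.request.Request(url, headers=headers)


def summarize_listing(body):
    """Return (number of items, 'after' cursor) from a listing JSON string."""
    data = json.loads(body)["data"]
    return len(data["children"]), data.get("after")


def merge_pages(pages):
    """Merge pages of listing children, dropping duplicates by fullname."""
    seen, merged = set(), []
    for children in pages:
        for child in children:
            name = child["data"]["name"]
            if name not in seen:
                seen.add(name)
                merged.append(child)
    return merged
```

Sending both requests and comparing the counts from `summarize_listing` should show whether authentication is what changes the result size in your case. Deduplicating by fullname in `merge_pages` is a workaround for the repeated items seen while paging, not a documented guarantee of the API.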