As of 2022, the official Instagram API lets you access only your own posts, and not even public posts and comments on Instagram. The restriction follows rising privacy concerns among users and frequent accusations of data breaches at many big companies, including Facebook. This has made it difficult for programmers to crawl Instagram data.
So, how do you crawl Instagram data?
There’s still a workaround. Instagram does provide an API that is publicly accessible.
Let’s try to hit this URL.
tl;dr
This API doesn’t work anymore. You need to undergo App Review and request approval for:
– the Instagram Public Content Access feature
– the instagram_basic permission
You can read more here: Hashtag Search
Eureka, it’s a JSON response:
URL & JSON response
URL: https://www.instagram.com/explore/tags/travel/?__a=1
Here, travel is the hashtag, as the JSON response also shows. The response contains the posts tagged with the hashtag travel, and its structure is easy to follow: edges is the list that holds the posts’ data. So all we need to do is parse this JSON to get the data.
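To make the nesting concrete, here is a minimal Python sketch that walks a hand-made stand-in for the response. The sample captions are invented for illustration, and the function name extract_captions is hypothetical; only the key names are taken from the parser later in this post.

```python
# A hand-made stand-in for the JSON response. The captions are invented;
# only the key names mirror the ones used by the parser in this post.
sample = {
    "graphql": {
        "hashtag": {
            "edge_hashtag_to_media": {
                "edges": [
                    {"node": {"edge_media_to_caption": {
                        "edges": [{"node": {"text": "Sunset in Bali #travel"}}]}}},
                    # A post with no caption has an empty inner edges list.
                    {"node": {"edge_media_to_caption": {"edges": []}}},
                    {"node": {"edge_media_to_caption": {
                        "edges": [{"node": {"text": "Road trip! #travel"}}]}}},
                ]
            }
        }
    }
}


def extract_captions(data):
    """Return the first caption of every post that has one."""
    posts = data["graphql"]["hashtag"]["edge_hashtag_to_media"]["edges"]
    captions = []
    for post in posts:
        caption_edges = post["node"]["edge_media_to_caption"]["edges"]
        if caption_edges:  # skip posts without a caption
            captions.append(caption_edges[0]["node"]["text"])
    return captions


print(extract_captions(sample))
```

The outer edges list holds one entry per post, and each post carries its own inner edges list for the caption, which is why the lookup goes through two levels of edges and node.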
Programmatically parsing response using Python
Libraries required: requests
Here’s a quick Python script to get the captions from the posts; you can modify it for your own use:
import requests


class Parser:
    # Keys used to walk the JSON response down to each post's caption.
    HASH_KEY = "graphql"
    HASHTAG_KEY = "hashtag"
    MEDIA_KEY = "edge_hashtag_to_media"
    LIST_KEY = "edges"
    NODE_KEY = "node"
    CAPTION_LIST_KEY = "edge_media_to_caption"
    TEXT_KEY = "text"

    def __init__(self, tag):
        self.tag = tag

    def get_url(self):
        # Build the hashtag URL on a single line; the original split the
        # concatenation across two lines, which is a syntax error.
        return "https://www.instagram.com/explore/tags/" + self.tag + "/?__a=1"

    def get_request_response(self):
        r = requests.get(url=self.get_url())
        r.raise_for_status()  # fail early on HTTP errors
        return r.json()

    def get_captions(self):
        captions = []
        data = self.get_request_response()
        nodes_list = data[Parser.HASH_KEY][Parser.HASHTAG_KEY][Parser.MEDIA_KEY][Parser.LIST_KEY]
        for obj in nodes_list:
            caption_list = obj[Parser.NODE_KEY][Parser.CAPTION_LIST_KEY][Parser.LIST_KEY]
            # Some posts have no caption, so guard against an empty list.
            if len(caption_list) > 0:
                caption = caption_list[0][Parser.NODE_KEY][Parser.TEXT_KEY]
                captions.append(caption)
                print(caption)
        return captions


def main():
    parser = Parser("travel")
    parser.get_captions()


if __name__ == "__main__":
    main()
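Since the tl;dr above notes that this API no longer works, a request may come back with something other than the expected JSON. The sketch below shows one way to decode the body defensively with the standard json module. The helper name parse_json_body is hypothetical, and the idea that the body might be an HTML page is an assumption, not something confirmed in this post.

```python
import json


def parse_json_body(body):
    """Return the decoded JSON object, or None if the body is not a JSON object."""
    try:
        data = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    # The parser in this post expects a dict at the top level.
    return data if isinstance(data, dict) else None


# Assumed example bodies: an HTML page versus a JSON object.
print(parse_json_body("<html>login</html>"))
print(parse_json_body('{"graphql": {}}'))
```

With requests, this could be called as parse_json_body(r.text) in place of r.json(), so the script can report a missing response instead of crashing.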
We will be posting more programming tutorials soon. If you like our posts, please like, comment, and share.