scrape-insta
Scrape data from Instagram without registering for the authenticated API. By FRM Developer, based on scraper-instagram.
Getting started
Prerequisites
- NodeJS
- NPM
- Yarn
Install
From npm
Add the following line to the dependencies in package.json
"scrape-insta": "latest"
or run the following in a terminal / command prompt
npm install scrape-insta
Optional environment variables for more complete testing:
- SESSION_ID : session ID for the authentication test and the authenticated tests
- PUBLIC_PROFILE : public profile to access
- PRIVATE_PROFILE : private profile to access
- STORY_PROFILE_ID : ID of a profile with a story to read
- STORY_PROFILE_USERNAME : username of a profile with a story to read
- HASHTAG (default value : cat) : hashtag to fetch
- LOCATION_ID (default value : 6889842, aka. Paris) : location to fetch
- POST : a post to fetch
- SEARCH_PROFILE : profile to search for
- SEARCH_HASHTAG (default value : cats) : hashtag to search for
- SEARCH_LOCATION (default value : Paris) : location to search for
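As a rough illustration of how these variables and their defaults fit together, here is a minimal sketch. The helper name `loadTestConfig` and the shape of the returned object are hypothetical and are not part of scrape-insta; only the variable names and default values come from the list above.

```javascript
// Hypothetical helper (not part of scrape-insta): collect the optional
// test settings from an environment object, applying the listed defaults.
function loadTestConfig(env = process.env) {
  return {
    sessionId: env.SESSION_ID,
    publicProfile: env.PUBLIC_PROFILE,
    privateProfile: env.PRIVATE_PROFILE,
    storyProfileId: env.STORY_PROFILE_ID,
    storyProfileUsername: env.STORY_PROFILE_USERNAME,
    hashtag: env.HASHTAG || 'cat',
    locationId: env.LOCATION_ID || '6889842',
    post: env.POST,
    searchProfile: env.SEARCH_PROFILE,
    searchHashtag: env.SEARCH_HASHTAG || 'cats',
    searchLocation: env.SEARCH_LOCATION || 'Paris'
  };
}

// Only HASHTAG is overridden here, so the other defaults still apply.
const exampleConfig = loadTestConfig({ HASHTAG: 'dogs' });
console.log(exampleConfig.hashtag, exampleConfig.locationId, exampleConfig.searchLocation);
```

Variables without a default stay undefined when unset, which a test suite could use to skip the corresponding tests.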
Methods not covered by tests:
- subscribeUserPosts
- subscribeHashtagPosts
- subscribeAccountNotifications
Usage
const Insta = require('scrape-insta');
const InstaClient = new Insta();
Authentication
Authentication allows you to access private profiles as long as you follow them.
Importing your session ID on desktop
- Go to instagram.com
- Log in (if not already logged in)
- Open the development tools (Ctrl+Shift+I)
- Get the sessionid cookie value
  - For Chromium-based browsers : Application tab
  - For Firefox-based browsers : Storage tab
Importing your session ID on Android
- Open instagram.com in a browser
- Log in (if not already logged in)
- Open the HTTP Canary app and start monitoring
- Go back to the browser and visit instagram.com
- Open HTTP Canary and stop monitoring
- Find instagram.com in HTTP Canary and look for the sessionid cookie
Code
InstaClient.authBySessionId(yourSessionId)
.then(account => console.log(account))
.catch(err => console.error(err));
If authentication succeeds, you will get the form data from accounts/edit :
{
"first_name": "",
"last_name": "",
"email": "",
"is_email_confirmed": true,
"is_phone_confirmed": true,
"username": "",
"phone_number": "",
"gender": 1,
"birthday": null,
"biography": "",
"external_url": "",
"chaining_enabled": true,
"presence_disabled": false,
"business_account": false,
"usertag_review_enabled": false
}
If your session ID is invalid, you will get a 401 error.
Username/password authentication may be supported in the future.
Get
These methods let you fetch specific elements from Instagram when you know exactly what you are looking for.
Error handling
Get methods may return an error in the following cases.
- Request error : failed to get data from Instagram (HTTP code)
- Parsing error : failed to parse data returned by Instagram (406)
- No content : nothing to parse (204)
- Authentication required : session ID required to access data (401)
- Too many requests : rate limit exceeded (429)
- Conflict : automation detected, password reset required (409)
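The codes above could be turned into readable messages with a small lookup, sketched below. This README does not state how the code is exposed on the rejected value, so the helper `describeError` is hypothetical and simply assumes you already have the numeric code in hand.

```javascript
// Codes and meanings copied from the error list in this section.
const KNOWN_ERRORS = {
  204: 'No content : nothing to parse',
  401: 'Authentication required : session ID required to access data',
  406: 'Parsing error : failed to parse data returned by Instagram',
  409: 'Conflict : automation detected, password reset required',
  429: 'Too many requests : rate limit exceeded'
};

// Hypothetical helper (not part of scrape-insta). Assumption: `code` is
// the numeric code carried by the error; any other code is treated as a
// plain request error reported with its HTTP code.
function describeError(code) {
  if (Object.prototype.hasOwnProperty.call(KNOWN_ERRORS, code)) {
    return KNOWN_ERRORS[code];
  }
  return `Request error : failed to get data from Instagram (HTTP ${code})`;
}

console.log(describeError(429));
console.log(describeError(500));
```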
Get profile by username
InstaClient.getProfile(username)
.then(profile => console.log(profile))
.catch(err => console.error(err));
Result
- id (string) - Instagram identifier, only used for stories
- name (string) - public full name
- pic (url) - public profile picture
- bio (string) - public biography
- website (url) - public website
- private (boolean) - account private state
- access (boolean) - access to the profile's feed
  To have access to a private account's feed, you must have sent the account a follow request that was accepted.
- verified (boolean) - account verified state
- followers (integer) - number of users following this profile
- following (integer) - number of users this profile follows
- posts (integer) - number of posts this profile published
- lastPosts (array of posts) - last posts
  This property is empty ([]) when the profile doesn't have any post, but null if access is false (denied).
- link (url) - link to the profile's page
- business (string) - business category (when applicable and profile unblocked)
- user (object) - user relevant properties (while authenticated) :
  - mutualFollowers (array of usernames) - people following you and this profile
  - blocking (boolean) - you blocked this profile
  - blocked (boolean) - this profile blocked you (only available property in user while true)
  - requesting (boolean) - you sent a follow request to this profile (if private)
  - requested (boolean) - this profile sent you a follow request (if yours is private)
  - following (boolean) - you're following this profile
  - followed (boolean) - this profile follows you
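The difference between an empty lastPosts array and a null one is easy to miss, so here is a minimal sketch of how a caller might tell the cases apart. The helper `feedState` and the sample objects are hypothetical; only the access and lastPosts semantics come from the property list above.

```javascript
// Hypothetical helper (not part of scrape-insta): classify the feed of a
// profile object shaped like the getProfile result described above.
function feedState(profile) {
  // lastPosts is null when access to the feed is denied.
  if (profile.access === false || profile.lastPosts === null) return 'denied';
  // lastPosts is an empty array when the profile has not published anything.
  if (profile.lastPosts.length === 0) return 'empty';
  return 'available';
}

// Hand-written sample objects, not real Instagram data.
const deniedProfile = { private: true, access: false, lastPosts: null };
const emptyProfile = { private: false, access: true, lastPosts: [] };

console.log(feedState(deniedProfile));
console.log(feedState(emptyProfile));
```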
Get profile story (requires authentication)
Using profile ID
InstaClient.getProfileStoryById(id)
.then(profile => console.log(profile))
.catch(err => console.error(err));
Using profile username (will automatically request profile ID)
InstaClient.getProfileStory(username)
.then(profile => console.log(profile))
.catch(err => console.error(err));
Result
- unread (boolean) - profile story is unread
- author (object) - a subset of profile
  - username
  - pic
- user (object) - user relevant properties
  - requesting
  - following
- items (array of stories) - profile stories
  - url (string) - link to original story file (jpg, mp4, ...)
  - type (string) - story type : photo or video
  - timestamp (epoch)
  - expirationTimestamp (epoch)
These methods return null when a profile has no story.
Note : calling these methods will not mark the story as read.
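Because the story getters can return null, callers need a guard before reading items. The sketch below shows one way to do that while also filtering out expired items. The helper `activeStoryItems` is hypothetical, and the unit of the epoch values is not stated in this README, so the caller is expected to pass `now` in the same unit as expirationTimestamp.

```javascript
// Hypothetical helper (not part of scrape-insta): return the story items
// that have not expired yet, or an empty array when there is no story.
// Assumption: `now` uses the same epoch unit as expirationTimestamp.
function activeStoryItems(story, now) {
  if (story === null) return []; // the getters return null without a story
  return story.items.filter(item => item.expirationTimestamp > now);
}

// Hand-written sample, not real Instagram data.
const sampleStory = {
  unread: true,
  items: [
    { type: 'photo', timestamp: 10, expirationTimestamp: 100 },
    { type: 'video', timestamp: 50, expirationTimestamp: 300 }
  ]
};

console.log(activeStoryItems(sampleStory, 200).map(item => item.type));
console.log(activeStoryItems(null, 200).length);
```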
Get hashtag
InstaClient.getHashtag(hashtag)
.then(hashtag => console.log(hashtag))
.catch(err => console.error(err));
Result
- pic (url) - hashtag profile pic (how it is chosen is unknown)
- posts (integer) - number of posts containing this hashtag
- featuredPosts (array of posts) - featured posts published with this hashtag
- lastPosts (array of posts) - last posts published with this hashtag
- link (url) - link to the hashtag's page
- user (object) - user relevant properties (while authenticated) :
  - following (boolean) - you subscribed to this hashtag (receiving posts in your personal feed)
Get location by ID
Unfortunately, using IDs is currently the only way to get a location.
InstaClient.getLocation(id)
.then(location => console.log(location))
.catch(err => console.error(err));
Result
- pic (url) - location profile pic
- posts (integer) - posts published from that location
- address (object)
  - street (string)
  - zipCode (string)
  - city (string)
  - latitude (float)
  - longitude (float)
- website (url) - place's website
- phone (string) - place's contact phone number
- featuredPosts (array of posts) - featured posts published from this location
- lastPosts (array of posts) - last posts published from this location
- link (url) - link to this location's page
Array of posts
Each entry is a subset of a full post, containing the following properties :
- shortcode (string) - post identifier
- caption (string) - post description
- comments (integer) - number of comments
- likes (integer) - number of likes
- thumbnail (url) - post thumbnail
  Always a static, lower-quality image, whether the post is a photo or a video.
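Since partial posts carry likes and comments as plain integers, they can be ranked without any further request. A minimal sketch, using a hypothetical helper `topByLikes` and hand-written sample data:

```javascript
// Hypothetical helper (not part of scrape-insta): return the `count` most
// liked partial posts without mutating the input array.
function topByLikes(posts, count) {
  return [...posts].sort((a, b) => b.likes - a.likes).slice(0, count);
}

// Hand-written sample shaped like the partial posts described above.
const samplePosts = [
  { shortcode: 'aaa', caption: 'first', comments: 2, likes: 10 },
  { shortcode: 'bbb', caption: 'second', comments: 0, likes: 42 },
  { shortcode: 'ccc', caption: 'third', comments: 5, likes: 7 }
];

console.log(topByLikes(samplePosts, 2).map(post => post.shortcode));
```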
Get post by shortcode
The shortcode is the post's identifier : the link to a post is instagram.com/p/shortcode.
InstaClient.getPost(shortcode)
.then(post => console.log(post))
.catch(err => console.error(err));
Result
- author (object) - a subset of a profile's properties.
  - username (string)
  - name (string)
  - pic (url)
  - verified (boolean)
  - link (url)
- location
  - name (string)
  - city (string)
- contents (array of posts)
  - type (string) - post type : photo or video
  - url (string) - link to original post file (jpg, mp4, ...)
  - if type is video :
    - thumbnail (string) - link to thumbnail
    - views (integer) - number of views
- tagged (array of usernames) - people tagged in post contents
- likes (integer) - number of likes
- caption (string) - post description
- hashtags (array of hashtags) - hashtags mentioned in post description
- mentions (array of usernames) - people mentioned in post description
- edited (boolean) - caption edited
- comments (array of objects, max 40)
  - user (string) - comment author's username
  - content (string) - comment content
  - timestamp (epoch)
  - hashtags (array of hashtags)
  - mentions (array of usernames)
  - likes (integer)
- commentCount (integer)
- timestamp (epoch)
- link (string) - link to the post
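A full post embeds at most 40 comments, so the comments array may be shorter than commentCount. The sketch below shows how a caller might detect that, and how to pick out video contents. Both helpers are hypothetical, and the first assumes that commentCount is the total number of comments on the post, which this README does not state explicitly.

```javascript
// Hypothetical helper: true when the post has more comments than the
// embedded array holds. Assumption: commentCount is the total count.
function hasMoreComments(post) {
  return post.commentCount > post.comments.length;
}

// Hypothetical helper: keep only the video entries of a post's contents.
function videoContents(post) {
  return post.contents.filter(content => content.type === 'video');
}

// Hand-written sample shaped like the getPost result, not real data.
const samplePost = {
  contents: [
    { type: 'photo', url: 'photo.jpg' },
    { type: 'video', url: 'clip.mp4', thumbnail: 'clip.jpg', views: 12 }
  ],
  comments: [{ user: 'someone', content: 'nice', likes: 0 }],
  commentCount: 3
};

console.log(hasMoreComments(samplePost));
console.log(videoContents(samplePost).map(content => content.url));
```

When hasMoreComments is true, the remaining comments would have to come from the paginated getPostComments getter.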
Paginated getters (require authentication)
Paginated getters allow bulk data downloads.
Params :
- maxCount (integer) - max number of items to return
- pageId (string, optional) - page navigation identifier
Result : array with a nextPageId property
Sample :
(async () => {
  const page0 = await somePaginatedGetter(someId, 50);
  const page1 = await somePaginatedGetter(someId, 50, page0.nextPageId);
  const page2 = await somePaginatedGetter(someId, 50, page1.nextPageId);
})();
The pageId / nextPageId property may contain a string of digits, a base64 string, or a JSON string, but it must always be left untouched.
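Chaining pages by hand gets repetitive, so the pattern can be wrapped in a loop. The sketch below is one possible approach, not part of the library: `collectPages` and `fakeGetter` are hypothetical names, and the loop assumes that a missing nextPageId means there are no more pages, which this README does not confirm. The fake getter stands in for a real paginated getter so the example runs offline.

```javascript
// Hypothetical helper: follow nextPageId until it is missing or until
// maxPages pages were fetched. `getter` takes (id, maxCount, pageId) and
// resolves to an array carrying a nextPageId property.
async function collectPages(getter, id, pageSize, maxPages) {
  const items = [];
  let pageId; // undefined requests the first page
  for (let i = 0; i < maxPages; i++) {
    const page = await getter(id, pageSize, pageId);
    items.push(...page);
    pageId = page.nextPageId; // passed along untouched
    if (!pageId) break; // assumption: no nextPageId means last page
  }
  return items;
}

// Stand-in data and getter so the sketch runs without network access.
const FAKE_DATA = ['a', 'b', 'c', 'd', 'e'];
async function fakeGetter(id, maxCount, pageId) {
  const start = pageId ? Number(pageId) : 0;
  const page = FAKE_DATA.slice(start, start + maxCount);
  const next = start + maxCount;
  if (next < FAKE_DATA.length) page.nextPageId = String(next);
  return page;
}

collectPages(fakeGetter, 'someId', 2, 10)
  .then(items => console.log(items.join(',')))
  .catch(err => console.error(err));
```

With a real client, the getter argument could be something like `(id, count, pageId) => InstaClient.getProfilePostsById(id, count, pageId)`. The maxPages cap keeps a long feed from triggering an unbounded number of requests.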
Get profile posts
Result in array : full post object
Using profile ID
InstaClient.getProfilePostsById(profileId, maxCount, pageId)
.then(posts => console.log(posts))
.catch(err => console.error(err));
Using profile username (will automatically request profile ID)
InstaClient.getProfilePosts(profileUsername, maxCount, pageId)
.then(posts => console.log(posts))
.catch(err => console.error(err));
Get post comments
InstaClient.getPostComments(shortcode, maxCount, pageId)
.then(posts => console.log(posts))
.catch(err => console.error(err));
Result in array : comment object
Get hashtag posts
InstaClient.getHashtagPosts(hashtag, maxCount, pageId)
.then(posts => console.log(posts))
.catch(err => console.error(err));
Result in array : partial post object
Get location posts
InstaClient.getLocationPostsById(locationId, maxCount, pageId)
.then(posts => console.log(posts))
.catch(err => console.error(err));
Result in array : partial post object
Search
Search profile
InstaClient.searchProfile(query)
.then(profiles => console.log(profiles))
.catch(err => console.error(err));
Result in array : a subset of profile.
- username
- name
- pic
- private
- verified
- followers
- user
  - following
Search hashtag
InstaClient.searchHashtag(hashtag)
.then(hashtags => console.log(hashtags))
.catch(err => console.error(err));
Result in array : a subset of hashtag.
- name
- posts
Search location
InstaClient.searchLocation(location)
.then(locations => console.log(locations))
.catch(err => console.error(err));
Result in array : a subset of location.
- id
- name
- address
  - street
  - city
  - latitude
  - longitude
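Because a location can only be fetched by ID, and search results include an id, the two calls can be chained. The sketch below illustrates the idea with a hypothetical helper `findLocation` that accepts any client exposing searchLocation and getLocation as documented here. The stub client is made up so the example runs offline, and its return values are not real Instagram data.

```javascript
// Hypothetical helper: search for a location by name, then fetch the
// first match by its ID. Resolves to null when nothing matches.
async function findLocation(client, query) {
  const results = await client.searchLocation(query);
  if (results.length === 0) return null;
  return client.getLocation(results[0].id);
}

// Stub client standing in for an Insta instance (illustration only).
const stubClient = {
  searchLocation: async query =>
    (query === 'Paris' ? [{ id: '6889842', name: 'Paris' }] : []),
  getLocation: async id => ({ requestedId: id, posts: 0 })
};

findLocation(stubClient, 'Paris')
  .then(location => console.log(location))
  .catch(err => console.error(err));
```

Taking the first result is a simplification; a real caller would probably compare names or addresses before choosing.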
Subscribe to posts
- options (object, optional)
  - interval (integer, optional) - time in seconds between requests. Default : 30
  - lastPostShortcode (string, optional) - shortcode from which to begin if not the next one to be published.
  - fullPosts (boolean, optional) - fetch full post data, additional request required
From user
InstaClient.subscribeUserPosts(username, (post, err) => {
  if(post)
    console.log(post.shortcode);
  else
    console.error(err);
}, {
  interval,
  lastPostShortcode,
  fullPosts
});
From hashtag
InstaClient.subscribeHashtagPosts(hashtag, (post, err) => {
  if(post)
    console.log(post.shortcode);
  else
    console.error(err);
}, {
  interval,
  lastPostShortcode,
  fullPosts
});
Account requests (user-relevant methods)
Get account notifications
InstaClient.getAccountNotifications()
.then(notifications => console.log(notifications))
.catch(err => console.error(err));
Result in array : notification
- id (string) - Notification identifier
- timestamp (epoch)
- type (string) - Notification type : like, mention, comment, follow
- post
  - shortcode
  - thumbnail
- by
  - username
  - name
  - pic
- content (string) - Comment content (when applicable)
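As a small illustration of working with this structure, the sketch below tallies notifications by their type property. The helper `countByType` is hypothetical and the sample array is hand-written rather than real account data.

```javascript
// Hypothetical helper (not part of scrape-insta): count notifications
// per type (like, mention, comment, follow).
function countByType(notifications) {
  const counts = {};
  for (const notification of notifications) {
    counts[notification.type] = (counts[notification.type] || 0) + 1;
  }
  return counts;
}

// Hand-written sample shaped like the notification objects above.
const sampleNotifications = [
  { id: '1', type: 'like', by: { username: 'alice' } },
  { id: '2', type: 'comment', by: { username: 'bob' }, content: 'nice' },
  { id: '3', type: 'like', by: { username: 'carol' } }
];

console.log(countByType(sampleNotifications));
```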
Subscribe to account notifications
- options (object, optional)
  - interval (integer, optional) - time in seconds between requests. Default : 30
  - lastNotificationId (string, optional) - Notification ID
InstaClient.subscribeAccountNotifications((notification, err) => {
  // The callback receives a notification object, so log its id
  if(notification)
    console.log(notification.id);
  else
    console.error(err);
}, {
  interval,
  lastNotificationId
});
Get account stories
InstaClient.getAccountStories()
.then(stories => console.log(stories))
.catch(err => console.error(err));
Result in array : inbox-like
- unread
- author (object) - a subset of a profile's properties.
  - id
  - username
  - pic
- user (object) - user relevant properties
  - requesting
  - following
Changelog
- 1.0.0 (2019-03-26) • Initial release
- 1.0.1 (2019-03-27)
  - Fixed throw error scope
  - Fixed single photo post wrongly structured
  - Added support for comments
  - Added support for hashtags, mentions and tags in posts and comments
  - Added posts subscriptions feature from users (untested) and hashtags
- 1.0.2 (2019-03-27) • Added support for videos
- 1.0.4 (2019-03-27)
  - Fixed video post thumbnail & views count
  - Using promises & observable
- 1.0.5 (2019-03-27) • Added proper error for private accounts
- 1.0.6 (2019-03-31) • Private account access doesn't require mutual follow
- 1.0.7 (2019-04-03) • Added profile's last posts analytics #1 + more
- 1.0.8 (2019-04-14)
  - Using classes
  - Added support for authentication using session cookie (only allows to access friend profile)
  - Added support for locations
  - Added search feature for profiles, hashtags & locations
  - Added user-relevant properties
  - Added support for notifications history & subscription
  - Fixed subscriptions since #1
  - Removed useless id properties
- 1.0.9
  - Added business property to profile (when applicable)
  - Automatically access public profile anonymously when user blocked
- 1.0.10 (2020-01-26) • Fixed post comments on anonymous session
- 1.0.11 (2020-04-18)
  - Improved subscriptions
    - Using async/await
    - Using simple callbacks instead of observables
    - Using object parameter for options
    - Added full post fetching option
    - Added subscription unsubscribe method
  - Improved 401 detection
  - Improved parsing
    - Using RegExp
    - Removed JSDOM dependency
  - Added support for edited post captions
- 1.0.12 (2020-06-16) • Small fix & refactor
- 1.0.13 (2020-07-10) • Added support for stories
- 1.0.14 (2020-10-17) • Fixed access to own private profile
- 1.0.15 (2020-12-19)
  - Removed Request dependency
  - Improved 429 detection
  - Added unit tests
- 1.0.16 (2020-12-25)
  - Added full post commentCount property
  - Added partial post timestamp property
  - Added post comment id property
  - Added profile IDs memoization
  - Added getProfilePostsById & getProfilePosts methods
  - Added getPostComments method
  - Added getHashtagPosts method
  - Added getLocationPostsById method
  - Added 409 detection
  - Improved 401 detection
  - Restored full post shortcode property
- 1.0.17 (2021-01-21)
  - Fixed error on empty profile story
  - Fixed 409 detection
  - Added profile story external link
- 2.0.0 (202?-??-??)
  - Refactored names
  - Refactored scopes
  - Refactored promises
  - Refactored errors
  - Refactored indents
  - Renamed getLocation to getLocationById
  - Reversed subscribe methods' post & error parameters
  - Improved unit tests coverage
  - Added JSDoc
  - Added links to partial post, hashtag & location
  - Fixed post author link