Scraping User Location from Twitter

Date: 2023-09-10

This article describes an approach to scraping user location from Twitter; it should be a useful reference for anyone tackling the same problem.

Problem Description

I am trying to scrape the latitude and longitude of Twitter users given their user names. The user name list is a CSV file with more than 50 names in a single input file. Below are the two attempts I have made so far; neither of them seems to work. Corrections to either program, or an entirely new approach, are welcome.

I have a list of user names, and I am trying to look up each user's profile and pull the geolocation from the profile or timeline. I could not find many samples anywhere on the Internet.

I am looking for a better approach to get the geolocations of users from Twitter. I could not even find a single example that shows harvesting a user's location by user_name or user_id. Is it even possible in the first place?

Input: the input file has more than 50k rows

                  AfsarTamannaah,6.80E+17,12/24/2015,#chennaifloods
                  DEEPU_S_GIRI,6.80E+17,12/24/2015,#chennaifloods
                  DEEPU_S_GIRI,6.80E+17,12/24/2015,#weneverletyoudownstr
                  ndtv,6.80E+17,12/24/2015,#chennaifloods
                  1andonlyharsha,6.79E+17,12/21/2015,#chennaifloods
                  Shashkya,6.79E+17,12/21/2015,#moneyonmobile
                  Shashkya,6.79E+17,12/21/2015,#chennaifloods
                  timesofindia,6.79E+17,12/20/2015,#chennaifloods
                  ANI_news,6.78E+17,12/20/2015,#chennaifloods
                  DrAnbumaniPMK,6.78E+17,12/19/2015,#chennaifloods
                  timesofindia,6.78E+17,12/18/2015,#chennaifloods
                  SRKCHENNAIFC,6.78E+17,12/18/2015,#dilwalefdfs
                  SRKCHENNAIFC,6.78E+17,12/18/2015,#chennaifloods
                  AmeriCares,6.77E+17,12/16/2015,#india
                  AmeriCares,6.77E+17,12/16/2015,#chennaifloods
                  ChennaiRainsH,6.77E+17,12/15/2015,#chennairainshelp
                  ChennaiRainsH,6.77E+17,12/15/2015,#chennaifloods
                  AkkiPritam,6.77E+17,12/15/2015,#chennaifloods
                  

Code:

                  import tweepy
                  from tweepy import Stream
                  from tweepy.streaming import StreamListener
                  from tweepy import OAuthHandler
                  import pandas as pd
                  import json
                  import csv
                  import sys
                  import time
                  
                  CONSUMER_KEY = 'XYZ'
                  CONSUMER_SECRET = 'XYZ'
                  ACCESS_KEY = 'XYZ'
                  ACCESS_SECRET = 'XYZ'
                  
                  auth = OAuthHandler(CONSUMER_KEY,CONSUMER_SECRET)
                  api = tweepy.API(auth)
                  auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
                  
                  data = pd.read_csv('user_keyword.csv')
                  
                  df = ['user_name', 'user_id', 'date', 'keyword']
                  
                  test = api.lookup_users(user_ids=['user_name'])
                  
                  for user in test:
                      print user.user_name
                      print user.user_id
                      print user.date
                      print user.keyword
                      print user.geolocation
                  

Error:

                  Traceback (most recent call last):
                    File "user_profile_location.py", line 24, in <module>
                      test = api.lookup_users(user_ids=['user_name'])
                    File "/usr/lib/python2.7/dist-packages/tweepy/api.py", line 150, in lookup_users
                      return self._lookup_users(list_to_csv(user_ids), list_to_csv(screen_names))
                    File "/usr/lib/python2.7/dist-packages/tweepy/binder.py", line 197, in _call
                      return method.execute()
                    File "/usr/lib/python2.7/dist-packages/tweepy/binder.py", line 173, in execute
                      raise TweepError(error_msg, resp)
                  tweepy.error.TweepError: [{'message': 'No user matches for specified terms.', 'code': 17}]
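
The error above is raised because the literal string 'user_name' is passed as a user ID, so the API finds no matching account; the screen names read from the CSV also belong in the screen_names argument of lookup_users rather than user_ids. A minimal sketch of that lookup, assuming the same headerless user_keyword.csv and the Tweepy 3.x client shown in the traceback ('XYZ' credentials are placeholders, and lookup_users accepts at most 100 names per call):

    import pandas as pd
    import tweepy

    # Placeholder credentials, as in the question.
    CONSUMER_KEY = CONSUMER_SECRET = ACCESS_KEY = ACCESS_SECRET = 'XYZ'

    auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
    auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
    api = tweepy.API(auth)

    # The CSV has no header row, so label the columns explicitly.
    data = pd.read_csv('user_keyword.csv',
                       names=['user_name', 'user_id', 'date', 'keyword'])

    # lookup_users takes at most 100 entries per request; use the first
    # 100 distinct screen names as an illustration.
    screen_names = data['user_name'].drop_duplicates().head(100).tolist()

    for user in api.lookup_users(screen_names=screen_names):
        # 'location' is the free-text profile field; it is empty when unset.
        print(u"{0}: {1}".format(user.screen_name, user.location))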
                  

I understand that not every user shares a geolocation, but it would be great to get it for those who keep their profiles publicly open.

User locations as a place name and/or latitude/longitude are what I am looking for.

If this approach isn't correct, then I am open to alternatives as well.

Update One: After some deep searching I found this website, which provides a very close solution, but I am getting an error while trying to read the userName column from the input file.

It says only 100 users' information can be grabbed per call; what is a better way to lift that limitation?

Code:

                  import sys
                  import string
                  import simplejson
                  from twython import Twython
                  import csv
                  import pandas as pd
                  
                  #WE WILL USE THE VARIABLES DAY, MONTH, AND YEAR FOR OUR OUTPUT FILE NAME
                  import datetime
                  now = datetime.datetime.now()
                  day=int(now.day)
                  month=int(now.month)
                  year=int(now.year)
                  
                  
                  #FOR OAUTH AUTHENTICATION -- NEEDED TO ACCESS THE TWITTER API
                  t = Twython(app_key='ABC', 
                      app_secret='ABC',
                      oauth_token='ABC',
                      oauth_token_secret='ABC')
                  
                  #INPUT HAS NO HEADER NO INDEX
                  ids = pd.read_csv('user_keyword.csv', header=['userName', 'userID', 'Date', 'Keyword'], usecols=['userName'])
                  
                  #ACCESS THE LOOKUP_USER METHOD OF THE TWITTER API -- GRAB INFO ON UP TO 100 IDS WITH EACH API CALL
                  
                  users = t.lookup_user(user_id = ids)
                  
                  #NAME OUR OUTPUT FILE - %i WILL BE REPLACED BY CURRENT MONTH, DAY, AND YEAR
                  outfn = "twitter_user_data_%i.%i.%i.csv" % (now.month, now.day, now.year)
                  
                  #NAMES FOR HEADER ROW IN OUTPUT FILE
                  fields = "id, screen_name, name, created_at, url, followers_count, friends_count, statuses_count, 
                      favourites_count, listed_count, 
                      contributors_enabled, description, protected, location, lang, expanded_url".split()
                  
                  #INITIALIZE OUTPUT FILE AND WRITE HEADER ROW   
                  outfp = open(outfn, "w")
                  outfp.write(string.join(fields, "	") + "
                  ")  # header
                  
                  #THE VARIABLE 'USERS' CONTAINS INFORMATION FOR THE TWITTER USERS LOOKED UP ABOVE
                  #THIS BLOCK WILL LOOP OVER EACH OF THESE IDS, CREATE VARIABLES, AND OUTPUT TO FILE
                  for entry in users:
                      #CREATE EMPTY DICTIONARY
                      r = {}
                      for f in fields:
                          r[f] = ""
                      #ASSIGN VALUE OF 'ID' FIELD IN JSON TO 'ID' FIELD IN OUR DICTIONARY
                      r['id'] = entry['id']
                      #SAME WITH 'SCREEN_NAME' HERE, AND FOR REST OF THE VARIABLES
                      r['screen_name'] = entry['screen_name']
                      r['name'] = entry['name']
                      r['created_at'] = entry['created_at']
                      r['url'] = entry['url']
                      r['followers_count'] = entry['followers_count']
                      r['friends_count'] = entry['friends_count']
                      r['statuses_count'] = entry['statuses_count']
                      r['favourites_count'] = entry['favourites_count']
                      r['listed_count'] = entry['listed_count']
                      r['contributors_enabled'] = entry['contributors_enabled']
                      r['description'] = entry['description']
                      r['protected'] = entry['protected']
                      r['location'] = entry['location']
                      r['lang'] = entry['lang']
                      #NOT EVERY ID WILL HAVE A 'URL' KEY, SO CHECK FOR ITS EXISTENCE WITH IF CLAUSE
                      if 'url' in entry['entities']:
                          r['expanded_url'] = entry['entities']['url']['urls'][0]['expanded_url']
                      else:
                          r['expanded_url'] = ''
                      print r
                      #CREATE EMPTY LIST
                      lst = []
                      #ADD DATA FOR EACH VARIABLE
                      for f in fields:
                          lst.append(unicode(r[f]).replace("/", "/"))
                      #WRITE ROW WITH DATA IN LIST
                      outfp.write(string.join(lst, "\t").encode("utf-8") + "\n")
                  
                  outfp.close()    
                  

Error:

                  File "user_profile_location.py", line 35, in <module>
                      ids = pd.read_csv('user_keyword.csv', header=['userName', 'userID', 'Date', 'Keyword'], usecols=['userName'])
                    File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 562, in parser_f
                      return _read(filepath_or_buffer, kwds)
                    File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 315, in _read
                      parser = TextFileReader(filepath_or_buffer, **kwds)
                    File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 645, in __init__
                      self._make_engine(self.engine)
                    File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 799, in _make_engine
                      self._engine = CParserWrapper(self.f, **self.options)
                    File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1202, in __init__
                      ParserBase.__init__(self, kwds)
                    File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 918, in __init__
                      raise ValueError("cannot specify usecols when "
                  ValueError: cannot specify usecols when specifying a multi-index header
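
pandas raises this ValueError because header= expects row numbers; a list of strings there is read as a request for a multi-level header, which cannot be combined with usecols=. Column labels for a headerless file belong in names=. A minimal sketch of the intended read, assuming the same user_keyword.csv:

    import pandas as pd

    # names= labels the headerless columns; usecols= then selects by label.
    ids = pd.read_csv('user_keyword.csv',
                      names=['userName', 'userID', 'Date', 'Keyword'],
                      usecols=['userName'])

    # A plain list of screen names with duplicates removed.
    user_names = ids['userName'].drop_duplicates().tolist()
    print(user_names[:5])

That list can then be fed to the lookup call in slices of at most 100 names, as sketched after the answer below.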
                  

Recommended Answer

Assuming that you just want the location the user has put on his/her profile page, you can simply use API.get_user from Tweepy. Below is the working code.

                  #!/usr/bin/env python
                  from __future__ import print_function
                  
                  #Import the necessary methods from tweepy library
                  import tweepy
                  from tweepy import OAuthHandler
                  
                  
                  #user credentials to access Twitter API 
                  access_token = "your access token here"
                  access_token_secret = "your access token secret key here"
                  consumer_key = "your consumer key here"
                  consumer_secret = "your consumer secret key here"
                  
                  
                  def get_user_details(username):
                          userobj = api.get_user(username)
                          return userobj
                  
                  
                  if __name__ == '__main__':
                      #authenticating the app (https://apps.twitter.com/)
                      auth = tweepy.auth.OAuthHandler(consumer_key, consumer_secret)
                      auth.set_access_token(access_token, access_token_secret)
                      api = tweepy.API(auth)
                  
                      #for list of usernames, put them in iterable and call the function
                      username = 'thinkgeek'
                      userOBJ = get_user_details(username)
                      print(userOBJ.location)
                  

Note: this is a crude implementation. Write a proper sleeper function to obey the Twitter API rate limits.
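
For the 50k-row input mentioned in the question, one possible extension (a sketch under the same assumptions, not part of the original answer) is to batch the screen names 100 at a time through lookup_users, since users/lookup accepts at most 100 names per request, and to let Tweepy's wait_on_rate_limit option sleep through rate-limit windows instead of writing a sleeper by hand:

    import pandas as pd
    import tweepy

    # Placeholder credentials, as in the answer above.
    consumer_key = "your consumer key here"
    consumer_secret = "your consumer secret key here"
    access_token = "your access token here"
    access_token_secret = "your access token secret key here"

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    # Sleep until the rate-limit window resets instead of raising an error.
    api = tweepy.API(auth, wait_on_rate_limit=True)

    df = pd.read_csv('user_keyword.csv',
                     names=['userName', 'userID', 'Date', 'Keyword'])
    names = df['userName'].drop_duplicates().tolist()

    locations = {}
    # users/lookup accepts at most 100 screen names per call.
    for start in range(0, len(names), 100):
        batch = names[start:start + 100]
        try:
            for user in api.lookup_users(screen_names=batch):
                locations[user.screen_name] = user.location
        except tweepy.TweepError:
            # A batch in which no name matches an existing account raises
            # error code 17; skip it and continue with the next slice.
            continue

    for name, loc in locations.items():
        print(u"{0}: {1}".format(name, loc))

This keeps the request count to roughly one per 100 users, so a 50k-row file needs on the order of 500 calls.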

This concludes this article on scraping user location from Twitter; hopefully the recommended answer is helpful.
