Computer Nerds - help

Are there any computer nerds in here? I'm looking for someone good with pulling data from web pages for a specific task - write me a script to pull a user list of our group. The only way to see all the users is one page at a time. Eventually, in order to not bother people too much, I need to compare potential recruits with our user list. I have some ideas on how I can do it, but it would likely take me a few hours to figure out a script that would work. If anyone has the necessary skills to whip something up in a few minutes instead, that would be great. (I'd prefer something that could run off a *nix box instead of Windows also - unless you could schedule the script to periodically run and email me results or something)
Thanks!

Replies

  • jobryan4
    jobryan4 Posts: 138 Member
    I wondered how you were keeping track of who you invited and if I had invited the same people. I have no clue how to help but good luck!
  • Coolhand1969
    Coolhand1969 Posts: 821 Member
    I wondered how you were keeping track of who you invited and if I had invited the same people. I have no clue how to help but good luck!

    I've mostly been relying on my browser cache so far (if I put the name in the invite field and it matches, it will pop up the name to automatically fill it in). But it's not foolproof, and eventually it will be mostly useless. I also have a pretty good memory for the pictures (and only invite people with pictures, figuring they're the ones using the website and not just the app).
  • kathyazach
    kathyazach Posts: 22 Member
    I love it when you talk geek
  • Coolhand1969
    Coolhand1969 Posts: 821 Member
    I love it when you talk geek

    I know.... ;)
  • debramariemary
    debramariemary Posts: 13 Member
    Well, don't ask the administrators on this site. These message boards are the most confusing I've ever seen. The most recent post should be the one we see when we go into a topic. And we should easily be able to comment on someone else's post. I just did that and the comment is at the end of the topic even though I hit reply. It looks like I threw a random thought out there and it makes no sense. Hmmm.... Maybe if you find a good nerd he/she could fix the entire site.
  • spmane
    spmane Posts: 3 Member
    There are two ways of doing that.
    1. MFP provides an API (interface) to programmatically access and post to the website. I just applied for a developer account.
    2. We can write a script that interacts with the website, downloads the HTML, and parses out the required info.
    I can help you with this as long as you don't hold me to deadlines.
  • Coolhand1969
    Coolhand1969 Posts: 821 Member
    Well, don't ask the administrators on this site. These message boards are the most confusing I've ever seen. The most recent post should be the one we see when we go into a topic. And we should easily be able to comment on someone else's post. I just did that and the comment is at the end of the topic even though I hit reply. It looks like I threw a random thought out there and it makes no sense. Hmmm.... Maybe if you find a good nerd he/she could fix the entire site.

    I normally use quote instead of reply when replying to a particular message.
  • Coolhand1969
    Coolhand1969 Posts: 821 Member
    There are two ways of doing that.
    1. MFP provides an API (interface) to programmatically access and post to the website. I just applied for a developer account.
    2. We can write a script that interacts with the website, downloads the HTML, and parses out the required info.
    I can help you with this as long as you don't hold me to deadlines.

    spmane - thanks for the offer! I started playing around with "curl" last night in an attempt on #2 with some limited success, but not sure if I'm going to get it in a reasonable time or not. I'll update here if I figure anything out. Thanks!
  • Coolhand1969
    Coolhand1969 Posts: 821 Member
    eh, I've switched to trying to do it with wget, but the login form is thus far defeating me... (same obstacle with curl though, and I find wget easier personally)
  • spmane
    spmane Posts: 3 Member
    If you know Perl or Python scripting, try WWW::Mechanize (mechanize in Python)? I might have some time tomorrow. I can put together some quick code for you.
  • Coolhand1969
    Coolhand1969 Posts: 821 Member
    If you know Perl or Python scripting, try WWW::Mechanize (mechanize in Python)? I might have some time tomorrow. I can put together some quick code for you.

    I only know small smatterings of perl, python, php... not fluent by any means... I think I've gotten past the hard part though with wget... I've successfully managed to get past the login form and pull a page of the member list, so from here, I should be able to loop, grep, awk, and sed my way through.... :) dunno how much time I'll have the rest of the day to look at it though...
  • Coolhand1969
    Coolhand1969 Posts: 821 Member
    The following pulls out the usernames from the page just fine:

    grep username list.txt | awk -F\" '{print $3,$4}' |awk -F\= '{print $NF}' |awk 'NF>0'|grep -v group-invite


    I can probably get it slimmer and more efficient, but it seems to work..
  • Coolhand1969
    Coolhand1969 Posts: 821 Member
    Below is the shell script I have so far. I decided to grab last login day also so at some point it will make it easier to delete inactive users... Obviously I need to throw in some variables and looping and get the text processing better straightened out. Now that I'm into it, I'm also wondering if an automagic invite system could be set up... do a search on Indy users, check if they have a pic set, compare username to current username list as well as a list of previous invites, and send an invite if they have not been invited before.... hmmmm.... very interesting ;)


    # Log in to the server. This only needs to be done once.

    wget --keep-session-cookies --save-cookies cookies.txt \
    --post-data 'authenticity_token=1l6D%2Fx8XuXc35fmANFpU3NVhPSC8OO%2BomVqs7Kl9VbA%3D&username=xxxxxxxx&password=xxxxxxxxxx&remember_me=1' \
    http://www.myfitnesspal.com/account/login


    # Now grab the page or pages we care about.
    wget --load-cookies cookies.txt \
    -p 'http://www.myfitnesspal.com/groups/members/364-indianapolis-mfp-ers?page=4'

    # Now start processing the file.
    # This greps all users and also gets their last login days.
    # awk drops blank lines; the final grep drops the "--" separators that grep -A prints between matches.
    grep -A 2 'class="user"' 'www.myfitnesspal.com/groups/members/364-indianapolis-mfp-ers?page=4' | awk 'NF>0' | grep -v -e '--' > memberlist.txt
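The "variables and looping" mentioned above could look roughly like the sketch below. It is only an illustration, not the finished script: the group URL and cookies.txt come from the script above, while the function names (page_url, fetch_pages, extract_members), the page-N.html file names, and the one-second pause are assumptions made for the example. It also assumes GNU grep, and that the login step has already been run so cookies.txt exists.

```shell
#!/bin/sh
# Sketch only: BASE_URL and cookies.txt are taken from the script above;
# page_url, fetch_pages, extract_members and the page-N.html names are
# hypothetical, chosen just for this example.
BASE_URL="http://www.myfitnesspal.com/groups/members/364-indianapolis-mfp-ers"
COOKIES="cookies.txt"

# Build the URL for one page of the member list.
page_url() {
    printf '%s?page=%s\n' "$BASE_URL" "$1"
}

# Download pages 1..$1 into page-N.html, reusing the saved login cookies.
fetch_pages() {
    last_page=$1
    page=1
    while [ "$page" -le "$last_page" ]; do
        wget --load-cookies "$COOKIES" -O "page-$page.html" "$(page_url "$page")"
        sleep 1   # pause between requests to go easy on the server
        page=$((page + 1))
    done
}

# Same filter as the script above, applied to any number of saved pages.
extract_members() {
    grep -h -A 2 'class="user"' "$@" | awk 'NF>0' | grep -v -e '--'
}
```

Usage would be something like `fetch_pages 10 && extract_members page-*.html > memberlist.txt`, where 10 is a made-up page count; the real number of pages has to be read off the site.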
  • Coolhand1969
    Coolhand1969 Posts: 821 Member
    .....and I'm really really seriously done working on it for awhile now.... ;) hehe
  • spmane
    spmane Posts: 3 Member
    Okay I am done writing the script. I can send you the member list and/or the script privately.

    Send me a quick mail at my MFP username at gmail.

    I do not use wget or curl beyond simple non interactive downloads.
  • Coolhand1969
    Coolhand1969 Posts: 821 Member
    I emailed you...
    I finished the basic group username download functionality with my bash/wget script.... but like I said in the email, I'm sure yours is more professional.... thanks for looking at it.
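For the comparison step raised earlier in the thread (checking potential recruits against the member list and against previous invites), a minimal sketch might look like this. The function name new_recruits and the three file names are hypothetical, and each file is assumed to hold exactly one username per line, so the raw memberlist.txt from the script above would need to be reduced to bare usernames first.

```shell
#!/bin/sh
# Sketch only: new_recruits and the file names used with it are hypothetical.
# Each input file is assumed to hold one username per line.

# Print every candidate that is in neither the member list nor the invite log.
# usage: new_recruits CANDIDATES MEMBERS INVITED
new_recruits() {
    known=$(mktemp) || return 1
    # Merge current members and previous invites into one "already known" list.
    cat "$2" "$3" > "$known"
    # -F: fixed strings, -x: match whole lines only, -v: keep non-matching lines
    grep -v -x -F -f "$known" "$1"
    rm -f "$known"
}
```

Something like `new_recruits candidates.txt members.txt invited.txt` would then list only the people worth inviting. Appending each name to invited.txt after an invite goes out would keep the log current. Matching here is case-sensitive, which may or may not fit how the site treats usernames.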