What is the best approach for creating an agent framework in Python for flexible script distribution and data collection?


0

What I am trying to do: I have hundreds of servers with very large log files, spread across dozens of different clients. I am writing Python scripts to parse the logs in different ways and would like to aggregate the data I collect from all of the servers. I would also like to keep the scripts, which change often, in one central place. The idea is to have a harness that can connect to each server, scp the script over, run it with pexpect or something similar, and then either scp the resulting data back in separate files to be aggregated or (preferably, I think) stream the data back and aggregate it on the fly. I do not have keys set up (nor do I want to set them up), but I do have a database with connection information, logins, passwords and the like.
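
To make the harness idea concrete, here is a rough sketch of only the fan-out and on-the-fly aggregation part. It assumes a hypothetical `run_remote(server, script)` callable that does the actual transport (copying the script over, logging in with the stored credentials, and returning the script's output lines), that each server record is a dict with a `"host"` key, and that each script prints one JSON object of counts per line. None of these names come from an existing library; they are placeholders for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed
import json


def collect(run_remote, server, script):
    """Run the script on one server and sum the counts it prints.

    Assumes every non-empty output line is a JSON object such as
    {"ERROR": 3, "WARN": 1}.
    """
    counts = Counter()
    for line in run_remote(server, script):
        line = line.strip()
        if line:
            counts.update(json.loads(line))
    return counts


def aggregate(servers, run_remote, script, max_workers=20):
    """Run the script on all servers in parallel and merge the results.

    Returns (totals, failures), where failures maps a host name to the
    error message, so one unreachable server does not abort the whole run.
    """
    totals = Counter()
    failures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(collect, run_remote, server, script): server
            for server in servers
        }
        # Merge each server's counts as soon as that server finishes.
        for future in as_completed(futures):
            server = futures[future]
            try:
                totals.update(future.result())
            except Exception as exc:
                failures[server["host"]] = str(exc)
    return totals, failures


def fake_run_remote(server, script):
    """Stand-in transport for trying the harness without any network."""
    if server["host"] == "down.example":
        raise ConnectionError("unreachable")
    return ['{"ERROR": 2, "WARN": 1}', '{"ERROR": 1}']


if __name__ == "__main__":
    demo_servers = [{"host": "a.example"}, {"host": "down.example"}]
    print(aggregate(demo_servers, fake_run_remote, "parse_logs.py"))
```

Keeping the transport behind a single callable means the scp/pexpect part can be swapped for something else later without touching the aggregation code.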

My question: this seems like it is probably a solved problem and I am wondering if someone knows of something that does this kind of thing or if there is a solid and proven way of doing this...

2009-06-16 14:52
by Ichorus


3

Looks like Hadoop is your answer: http://www.michael-noll.com/wiki/Writing_An_Hadoop_MapReduce_Program_In_Python

Pyro is also good, but I am not sure whether it can distribute scripts automatically: http://pyro.sourceforge.net/features.html

2009-06-16 15:19
by Anurag Uniyal
Wow. Pyro looks powerful, I will definitely be digging more into that. Hadoop looks a bit heavyweight for what I am trying to accomplish. Thanks - Ichorus 2009-06-16 16:11


1

Parallel Python provides some functionality for distributed computing and communication:

http://www.parallelpython.com/

2009-06-16 14:56
by Dan Lorenc
These are not exactly clustered computers...I think the task I am trying to accomplish is much simpler than what parallel python is designed for - Ichorus 2009-06-16 15:40


1

Take a look at Func. It's a framework for RPC-style communication with a large number of machines using Python. As a bonus, it comes with built-in TLS, so you don't need to layer it on top of SSH tunneling for security.

2009-06-16 15:06
by JimB


0

At least one part of your job, script distribution, could be handled by Sparrow, a script distribution system.

You can write your scripts in many languages, including Python. Sparrow treats scripts as software packages with versions, ownership and documentation, much as you would install packages via deb or rpm.

Sparrow provides a neat way to develop and manage various scripts in a centralized manner.

PS. Disclaimer: I am the tool's author.

2017-02-19 12:56
by Alexey Melezhik