Thread.Join(int) not killing thread after specified timeout in C#


3

In my Windows Forms application, I'm trying to test the user's ability to access a remote machine's shared folder. The way I'm doing this (and I'm sure that there are better ways...but I don't know of them) is to check for the existence of a specific directory on the remote machine (I'm doing this because of firewall/other security restrictions that I'm confronted with in my organization). If the user has rights to access the shared folder, the check returns almost immediately, but if they don't, it hangs forever. To solve this, I moved the check into another thread and wait only 1000 milliseconds before deciding that the user can't reach the share. However, when I do this, it still hangs as if the check were running on the calling thread.

What is making it hang and how do I fix it? I would think that running the check in a separate thread would allow me to just let the thread finish on its own in the background.

Here is my code:

bool canHitPath = false;
Thread thread = new Thread(new ThreadStart(() =>
{
    canHitPath = Directory.Exists(compInfo.Path);
}));
thread.Start();
thread.Join(1000);

if (canHitPath == false)
{
    throw new Exception("Cannot hit folder: " + compInfo.Path);
}

Edit: I feel that I should add that the line that throws the exception IS HIT. I've debugged this and verified it...however, it is when the exception is thrown that my program hangs. (I might also add that the exception is caught in the calling method and I never get to the catch statement in the debugger.)

Edit: So I found the answer to this question...but I can't select my answer as the answer for 24 hours...so until then...you'll have to go to the following link: https://stackoverflow.com/a/10045338/1203288 Sorry.

2012-04-05 16:52
by bsara
This is an interesting question. Intuitively, the Join(1000) should give up if the actual join cannot be performed within 1 second. I wonder: if the thread that you're trying to join is actually in an uninterruptible IO wait, can it cause both threads to get stuck in the IO wait? - Daniel Pryden 2012-04-05 17:07
Possibly related question (although not quite the same): How To: Prevent Timeout When Inspecting Unavailable Network Share - C# - Daniel Pryden 2012-04-05 17:08
@DanielPryden Yes, that is what happens. I have seen it happen before - Rafael Colucci 2012-04-05 17:17
Another possibly related question: How to join a thread that is hanging on blocking IO? However, that question is specifically about issues with libpthread on Linux - Daniel Pryden 2012-04-05 17:18
@DanielPryden I have already done it - Rafael Colucci 2012-04-05 17:23
@DanielPryden check this out: http://drdobbs.com/parallel/21200128 - Rafael Colucci 2012-04-05 17:29
@DanielPryden See my edit above for more details regarding your comment about both threads getting stuck and Join(1000) not killing the thread - bsara 2012-04-05 17:36
Guys: Join will never kill the thread. Never. It will only wait for some time, nothing more than that. It is on MSDN docs - Rafael Colucci 2012-04-05 17:38
@Brandon: Your edit is a bit confusing. Which line is throwing the exception? Do you mean the Directory.Exists() method, or some line of code within the BCL itself - Daniel Pryden 2012-04-05 17:47


2

I found the real problem:

The property compInfo.Path checks for the existence of a directory on the remote file system to determine whether the remote machine is 64-bit or not. Depending on the result, it returns a different value. I tried commenting out the check and the code executed successfully. This explains why I couldn't get past the throwing of the exception: I call compInfo.Path in the message of the exception, so the slow remote check runs again on the calling thread.

However, I think we learned a lot from "the real problem":

  1. The code I posted in the question (as is) works perfectly fine.
  2. thread.Join(int) will exit after the specified time interval regardless of the fact that the thread may still be executing code.
  3. The thread being joined CAN be running an IO operation (thus tying up a file/directory) and the desired result will still come about when doing a thread.Join(int).
  4. Using the "step into" button on a debugger will reveal many things...even about your own "solid" code. :)
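
The points above can be sketched as follows. This is a minimal illustration, not the poster's actual code: SlowCompInfo is a hypothetical stand-in for compInfo whose Path getter is assumed to be slow (simulated here with a sleep), and ShareCheck.EnsureReachable is an invented name. The idea is that the slow getter is only ever touched on the worker thread, and the exception message uses the value the worker captured instead of reading the property again.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical stand-in for compInfo: the getter simulates the slow
// remote probe by sleeping before it returns a path.
class SlowCompInfo
{
    public string Path
    {
        get
        {
            Thread.Sleep(5000); // assumption: models the slow remote check
            return @"\\remote-host\share\dir"; // made-up path for illustration
        }
    }
}

static class ShareCheck
{
    public static void EnsureReachable(SlowCompInfo compInfo, int timeoutMs)
    {
        string resolvedPath = null;
        bool canHitPath = false;

        var thread = new Thread(() =>
        {
            // The slow property is read here, on the worker thread only.
            string path = compInfo.Path;
            resolvedPath = path;
            canHitPath = Directory.Exists(path);
        });
        thread.IsBackground = true; // do not keep the process alive
        thread.Start();

        // Join returns true only if the worker finished within the timeout.
        bool finished = thread.Join(timeoutMs);

        if (!finished || !canHitPath)
        {
            // Use the captured value; compInfo.Path is never read on this thread.
            throw new IOException("Cannot hit folder: "
                + (resolvedPath ?? "(path not resolved in time)"));
        }
    }
}
```

With a 200 ms timeout, EnsureReachable(new SlowCompInfo(), 200) should throw after roughly 200 ms rather than after the 5 seconds the getter takes.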

Thanks everyone for your help, patience, and thoughtful input/insights.

2012-04-06 15:04
by bsara


2

My guess is that this happens because when you call:

canHitInstallPath = Directory.Exists(compInfo.InstallPath);

The Exists method holds the execution flow of the thread (it is an uninterruptible call). If it hangs for 30 seconds, then your thread will be waiting for 30 seconds until it has a chance to check if Thread.Join(1000) has elapsed.

Note that the Thread.Join() method only blocks the calling thread (usually the application's main thread of execution) until your thread object completes. You can still have other threads executing in the background while waiting for your specific Thread to finish executing.

From:

Thread.Join Method (System.Threading)
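
A small sketch of how the timed overload can be observed. JoinDemo is an invented name, and Thread.Sleep is only a stand-in for the blocking Directory.Exists call (as noted in the comments elsewhere in this thread, a sleep is not the same kind of wait as blocking IO). It shows that Thread.Join(int) returns a bool telling the caller whether the thread finished, and that the thread is still alive after a timeout.

```csharp
using System;
using System.Threading;

static class JoinDemo
{
    // Starts a worker that runs for workMs, waits at most timeoutMs for it,
    // and reports (finished, stillAlive) as seen by the calling thread.
    public static Tuple<bool, bool> Run(int workMs, int timeoutMs)
    {
        var worker = new Thread(() => Thread.Sleep(workMs));
        worker.IsBackground = true;
        worker.Start();

        bool finished = worker.Join(timeoutMs); // false means the wait timed out
        bool stillAlive = worker.IsAlive;       // Join does not stop the worker
        return Tuple.Create(finished, stillAlive);
    }
}
```

For example, JoinDemo.Run(2000, 100) should report that the join timed out while the worker is still running, and JoinDemo.Run(10, 2000) should report a finished worker.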

Another thing to consider: only checking whether a folder exists tells you nothing about whether the user can read or write files in it. Your best option is to try to read or write a file in the folder. This way you can make sure that the user has permissions in that folder.
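
One way this could look for the read side, as a rough sketch: FolderAccess.CanList is an invented helper that tries to list the folder and treats an access or IO failure as "cannot read". It does not test write permission, and on an unreachable remote share it can block just like Directory.Exists, so it would still need to run under a timeout.

```csharp
using System;
using System.IO;
using System.Linq;

static class FolderAccess
{
    // Returns true if the folder can be opened and listed by the current user.
    public static bool CanList(string path)
    {
        try
        {
            // Asking for the first entry makes sure the directory is actually read.
            Directory.EnumerateFileSystemEntries(path).FirstOrDefault();
            return true;
        }
        catch (UnauthorizedAccessException)
        {
            return false; // the folder exists but the user may not read it
        }
        catch (IOException)
        {
            return false; // includes DirectoryNotFoundException
        }
    }
}
```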

EDIT

In your case, threads only make sense if you can do other things while waiting for the thread to finish. If you can't, then threads are not helping you at all.

EDIT2

A link to support my answer: File Descriptors And Multithreaded Programs

EDIT3

Your best option is to create a killer thread. This thread will kill the Directory.Exists thread when it hangs for more than X seconds.

2012-04-05 17:02
by Rafael Colucci
FYI, Brandon states in the question that the Directory.Exists() is taking a long time because it's a remote path. If the network connection has been lost, it can take 30 seconds (or possibly even longer, IIRC) before the operating system aborts the IO operation. (You can reproduce this hang even in Windows Explorer.) What he's trying to do is to isolate that IO wait to another thread so as to detect a failure more quickly - Daniel Pryden 2012-04-05 17:29
@DanielPryden I agree. I edited the answer - Rafael Colucci 2012-04-05 17:30
How would one execute what has been stated in EDIT3? I originally tried doing a sleep instead of a join, then doing an abort on the thread...but I received the same results - bsara 2012-04-05 18:53


2

Your comments make it clear that the exception is actually thrown and caught. So code execution progressed at least past this code and we can't tell from the snippet what it is doing.

You did make one mistake: you forgot to set the thread's IsBackground property to true. Without it, the program can't terminate while the thread is still running, which is one way you might conclude "it's blocking!". If this guess isn't accurate then we'll need to see the main thread's call stack to have an idea what it is doing. That is best captured by turning on unmanaged debugging support and enabling the Microsoft Symbol Server so we can see the entire call stack, not just the managed parts of it.
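
A minimal sketch of the IsBackground suggestion. BackgroundProbe is an invented helper name; the only point is that the property is set before Start, so a worker that is still blocked does not keep the process alive once the foreground threads have exited.

```csharp
using System;
using System.Threading;

static class BackgroundProbe
{
    // Starts 'work' on a background thread and returns the thread so the
    // caller can Join it with a timeout.
    public static Thread Start(Action work)
    {
        var thread = new Thread(() => work());
        thread.IsBackground = true; // set before Start
        thread.Start();
        return thread;
    }
}
```

A caller might then write BackgroundProbe.Start(() => Directory.Exists(path)).Join(1000) and treat a false return value as a timeout.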

A completely different approach is to use the Ping class to probe the server.
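
A rough sketch of the Ping approach, using System.Net.NetworkInformation.Ping with a timeout. HostProbe.IsReachable is an invented name. Note that a successful ping only says the host answers ICMP; it says nothing about the share or the user's permissions on it.

```csharp
using System;
using System.Net.NetworkInformation;

static class HostProbe
{
    // Returns true if the host answered an ICMP echo within timeoutMs.
    public static bool IsReachable(string host, int timeoutMs)
    {
        try
        {
            using (var ping = new Ping())
            {
                PingReply reply = ping.Send(host, timeoutMs);
                return reply != null && reply.Status == IPStatus.Success;
            }
        }
        catch (PingException)
        {
            return false; // e.g. the host name could not be resolved
        }
    }
}
```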

2012-04-05 17:55
by Hans Passant
I just tried the IsBackground thing. I'm getting the same results...and to clarify: I DON'T ever get to the catch of my exception being thrown and I don't want my program to terminate when the exception is thrown. Also, I actually perform a Ping just before the code that I listed above. However, ping still works if a computer is turned off...this I've tested. But the testing of a directory existence not only tells me if permissions are correct...but also if the remote machine is turned on or not...and, of course, if the directory actually exists which I need to know as well - bsara 2012-04-05 18:42
I need to see a stack trace to further improve my guesses. Crystal ball had a rough day today - Hans Passant 2012-04-05 18:47


0

Maybe take a look at the Task Parallel Library. The TPL is designed to maximize performance and might handle whatever the issue might be. Though it also might be complete overkill for the situation.
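
For comparison, a sketch of the same timed wait written with a Task instead of a Thread. TaskProbe.TryWithTimeout is an invented name, and Task.Run assumes .NET 4.5 or later. As with Thread.Join(int), the timed Task.Wait only stops waiting; the underlying work keeps running on its thread-pool thread.

```csharp
using System;
using System.Threading.Tasks;

static class TaskProbe
{
    // Runs 'check' on the thread pool and waits at most timeoutMs for it.
    // Returns false if the check fails, throws, or does not finish in time.
    public static bool TryWithTimeout(Func<bool> check, int timeoutMs)
    {
        Task<bool> task = Task.Run(check);
        try
        {
            // Wait(int) returns false on timeout; Result is read only when done.
            return task.Wait(timeoutMs) && task.Result;
        }
        catch (AggregateException)
        {
            return false; // the check itself threw
        }
    }
}
```

A call such as TaskProbe.TryWithTimeout(() => Directory.Exists(path), 1000) takes a plain string path, so no slow property getter is evaluated on the calling thread.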

2012-04-05 17:02
by richk
I just tried implementing this using the Task class in place of the Thread class. I'm getting the exact same result. Thanks for the comment though, I totally forgot about using Tasks instead of Threads - bsara 2012-04-05 17:26
@Brandon it does not matter what library you use. This will always happen - Rafael Colucci 2012-04-05 17:31


0

Thread.Join is a blocking call that will hold up the thread it was called from until the thread it was called on exits.

You are basically spinning up a new thread to do background work, and then telling the main thread to wait until it finishes. In effect doing it all synchronously.

2012-04-05 17:03
by Bradley Uffner
Yeah, that's what I thought at first, too. But he's using the overload of Thread.Join() that takes a timeout, so at the very least the behavior is unintuitive - Daniel Pryden 2012-04-05 17:06
Thread.Join(1000) will return after one second no matter what. It will kill the thread after a second no matter what it is doing. It shouldn't be blocking for more than that - I've just copy-pasted this code and swapped Exists with Thread.Sleep(30000) and it blocks for one second only. Not sure what is wrong with OP's situation.. - J... 2012-04-05 17:09
@J...: Seriously, it will kill the thread after the timeout? I don't see any reference for that in the documentation - Daniel Pryden 2012-04-05 17:10
@J...: Also, Sleep is an interruptible wait, while the problem the OP is encountering is with an uninterruptible IO wait. They are not the same thing at all - Daniel Pryden 2012-04-05 17:12
Well, it's definitely not "killing" the thread for me. I did intentionally put that timeout in there as explained in my question - bsara 2012-04-05 17:20
@J Sleep does not hang the flow execution as @Daniel said. That's the difference - Rafael Colucci 2012-04-05 17:21
@J Join will never kill the thread - Rafael Colucci 2012-04-05 17:22
@all - my bad. Learn something every day! I would have thought, at least, that Join would not depend on the thread in question. This almost feels like a bug. Either a thread is done or it is not. There should be no reason that Join cannot count 1000ms all by itself and determine whether or not the thread has finished by then. Can anyone say why Join cannot query the state of the thread, or why the thread itself has to participate in message passing, when we're talking about a boolean (atomic) value - J... 2012-04-05 20:04
I mean, consider a class which inherits Thread and has a public boolean property IsDone, set true when the thread completes (until disposed). Surely the thread does not need to be free for another thread to read that memory location. What value is there with Join being an active query requiring a response from the thread itself? Is there a Join alternative which doesn't care what the thread it is waiting on is doing - something which just waits for whatever time and then returns - J... 2012-04-05 20:09