Starting a process and capturing its output may look simple.
(Do NOT copy this! The code below is wrong!)
public async Task<(int code, string output, string error)> RunCommandAsync(string bin, string arg, string path)
{
    var process = new Process
    {
        StartInfo = new ProcessStartInfo
        {
            FileName = bin,
            Arguments = arg,
            CreateNoWindow = true,
            WindowStyle = ProcessWindowStyle.Hidden,
            WorkingDirectory = path,
            UseShellExecute = false,
            RedirectStandardOutput = true,
            RedirectStandardError = true,
        }
    };
    process.Start();
    await process.WaitForExitAsync();
    var output = process.StandardOutput.ReadToEnd();
    var error = process.StandardError.ReadToEnd();
    return (process.ExitCode, output, error);
}
That code might appear to work. But when I tested it with the following code:
[TestMethod]
public async Task TestLargeOutput()
{
    var service = new CommandService();
    var testDirectory = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString());
    _ = await service.RunCommandAsync("git", "clone https://github.com/ediwang/moonglade.git --bare --filter=tree:0 .", testDirectory);
    var (code, output, error) = await service.RunCommandAsync("git", "--no-pager log --pretty=format:\"%H\" --max-count=2000", testDirectory);
    Assert.AreEqual(0, code);
    Assert.IsTrue(string.IsNullOrEmpty(error));
    // Total Lines:
    Assert.AreEqual(2000, output.Split('\n').Length);
    // Clean
    FolderDeleter.DeleteByForce(testDirectory);
}
It never quits! The test keeps running until it times out!
How could this be?
The output stream of a process needs to be consumed
The key issue is that the output stream of a process needs to be consumed. In the code above, my program waits for the process to exit before reading anything, which looks harmless. However, if the process generates a large amount of output in the meantime, that output accumulates in the standard output stream with nobody reading it.
Based on my testing, the buffer of the standard output stream is only 4KB. Once that buffer is full, the process cannot write any more to standard output and blocks. So Git waits forever for someone to read its output, while my program waits forever for Git to exit. To let it continue and exit, you just need to read from its output streams by hand, using the following commands:
ps -aux | grep git
cd /proc/1234
cd fd
cat 1
cat 2
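The commands above read its output streams (file descriptors 1 and 2 are standard output and standard error; 1234 stands for the PID of the stuck process) and empty the buffer, allowing the program to continue normally.
This means that the correct way to run a process is to keep reading its output streams while it runs.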
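As a minimal sketch of that idea, the smallest change to the wrong code is to start reading both streams before waiting for the process to exit. The class name MinimalRunner and its RunAsync method are hypothetical names used only for illustration; they are not part of the original code. The sketch assumes .NET 5 or later, where Process.WaitForExitAsync is available.

```csharp
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class MinimalRunner
{
    public static async Task<(int code, string output, string error)> RunAsync(string bin, string arg)
    {
        using var process = new Process
        {
            StartInfo = new ProcessStartInfo
            {
                FileName = bin,
                Arguments = arg,
                UseShellExecute = false,
                RedirectStandardOutput = true,
                RedirectStandardError = true,
            }
        };
        process.Start();

        // Start draining both streams BEFORE waiting, so the child process
        // never blocks on a full output buffer.
        var outputTask = process.StandardOutput.ReadToEndAsync();
        var errorTask = process.StandardError.ReadToEndAsync();

        await process.WaitForExitAsync();

        // Both reads finish once the child has closed its ends of the streams.
        return (process.ExitCode, await outputTask, await errorTask);
    }
}
```

The only difference from the wrong version is the order: the reads are already in flight while the program waits, instead of starting after the process has exited.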
Modifying the code to get it right
using System.Diagnostics;
using System.Text;

public class CommandService
{
    public async Task<(int code, string output, string error)> RunCommandAsync(string bin, string arg, string path,
        TimeSpan? timeout = null)
    {
        if (!Directory.Exists(path)) Directory.CreateDirectory(path);
        timeout ??= TimeSpan.FromMinutes(2);

        using var process = new Process
        {
            StartInfo = new ProcessStartInfo
            {
                FileName = bin,
                Arguments = arg,
                CreateNoWindow = true,
                WindowStyle = ProcessWindowStyle.Hidden,
                WorkingDirectory = path,
                UseShellExecute = false,
                RedirectStandardOutput = true,
                RedirectStandardError = true,
            }
        };
        process.Start();

        // Drain both streams while the process runs, so it never blocks on a full buffer.
        var outputMemoryStream = new MemoryStream();
        var errorMemoryStream = new MemoryStream();
        var programTask = Task.WhenAll(
            process.StandardOutput.BaseStream.CopyToAsync(outputMemoryStream),
            process.StandardError.BaseStream.CopyToAsync(errorMemoryStream),
            process.WaitForExitAsync()
        );

        // Whichever finishes first wins: the process (with both copies) or the timeout.
        await Task.WhenAny(
            Task.Delay(timeout.Value),
            programTask);
        if (!programTask.IsCompleted)
        {
            // Do not leave the child process running after giving up on it.
            if (!process.HasExited) process.Kill(entireProcessTree: true);
            throw new TimeoutException($"Running command '{bin} {arg}' in '{path}' timed out. Timeout is {timeout}.");
        }

        var output = Encoding.UTF8.GetString(outputMemoryStream.ToArray());
        var error = Encoding.UTF8.GetString(errorMemoryStream.ToArray());
        return (process.ExitCode, output, error);
    }
}
Now we keep reading the streams while waiting for the program to exit.
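If you would rather handle the output line by line instead of collecting raw bytes, the event-based API of Process is another way to keep the streams drained. The following is only a sketch under that assumption, not the code used in this post; EventRunner and RunAsync are hypothetical names. It relies on the documented behavior that the parameterless WaitForExit() also waits until the redirected-output event handlers have finished.

```csharp
using System.Diagnostics;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class EventRunner
{
    public static async Task<(int code, string output, string error)> RunAsync(string bin, string arg)
    {
        var output = new StringBuilder();
        var error = new StringBuilder();

        using var process = new Process
        {
            StartInfo = new ProcessStartInfo
            {
                FileName = bin,
                Arguments = arg,
                UseShellExecute = false,
                RedirectStandardOutput = true,
                RedirectStandardError = true,
            }
        };

        // Each handler is called once per line; Data is null when the stream closes.
        process.OutputDataReceived += (_, e) => { if (e.Data != null) output.AppendLine(e.Data); };
        process.ErrorDataReceived += (_, e) => { if (e.Data != null) error.AppendLine(e.Data); };

        process.Start();

        // Begin asynchronous reads so both streams are consumed while the process runs.
        process.BeginOutputReadLine();
        process.BeginErrorReadLine();

        await process.WaitForExitAsync();
        // Make sure all pending output events have been delivered before reading the builders.
        process.WaitForExit();

        return (process.ExitCode, output.ToString(), error.ToString());
    }
}
```

The trade-off is that this variant decodes text and splits it into lines as it arrives, so the original line endings are normalized, whereas copying BaseStream into a MemoryStream keeps the exact bytes.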
Let's test it!
using System.ComponentModel;
using System.Runtime.InteropServices;
using Aiursoft.CSTools.Services;
using Aiursoft.CSTools.Tools;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace Aiursoft.CSTools.Tests.Services;

[TestClass]
public class CommandServiceTests
{
    private readonly string _testCommand =
        RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? "-n 2 baidu.com" : "-c 2 baidu.com";

    [TestMethod]
    public async Task TestPing()
    {
        var service = new CommandService();
        var (code, output, error) = await service.RunCommandAsync("ping", _testCommand, Environment.CurrentDirectory);
        Assert.IsTrue(output.Contains("baidu.com"));
        Assert.IsTrue(string.IsNullOrEmpty(error));
        Assert.AreEqual(0, code);
    }

    [TestMethod]
    public async Task TestProgramNotExist()
    {
        var service = new CommandService();
        await Assert.ThrowsExceptionAsync<Win32Exception>(async () =>
        {
            await service.RunCommandAsync("notexist", string.Empty, Environment.CurrentDirectory);
        });
    }

    [TestMethod]
    public async Task TestProgramTimeout()
    {
        var service = new CommandService();
        await Assert.ThrowsExceptionAsync<TimeoutException>(async () =>
        {
            await service.RunCommandAsync("ping", _testCommand, Environment.CurrentDirectory, TimeSpan.FromMilliseconds(1));
        });
    }

    [TestMethod]
    public async Task TestProgramError()
    {
        var service = new CommandService();
        var (code, output, error) = await service.RunCommandAsync("ping", "-n 2 notexist", Environment.CurrentDirectory);
        Assert.IsTrue(output.ToLower().Contains("ping") || error.ToLower().Contains("ping"));
        Assert.IsTrue(code > 0);
    }

    [TestMethod]
    public async Task TestLargeOutput()
    {
        var service = new CommandService();
        var testDirectory = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString());
        _ = await service.RunCommandAsync("git", "clone https://github.com/ediwang/moonglade.git --bare --filter=tree:0 .", testDirectory);
        var (code, output, error) = await service.RunCommandAsync("git", "--no-pager log --pretty=format:\"%H\" --max-count=2000", testDirectory);
        Assert.AreEqual(0, code);
        Assert.IsTrue(string.IsNullOrEmpty(error));
        // Total Lines:
        Assert.AreEqual(2000, output.Split('\n').Length);
        // Clean
        FolderDeleter.DeleteByForce(testDirectory);
    }
}
Use the code above from NuGet
Of course, you can also get my process runner as a NuGet package!
Download it here: https://www.nuget.org/packages/Aiursoft.CSTools
Or run:
dotnet add package Aiursoft.CSTools --version 7.0.6
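Your article clearly exposes an easily overlooked buffer problem in C# process invocation and offers a practical solution. The core idea is laid out with rigorous logic, and validating the problem with the git clone test case shows how common it is; this scenario-driven style of analysis deserves praise. In the improved code, the asynchronous stream reading fits modern asynchronous programming better than the traditional synchronous ReadToEnd() approach, and the combination of MemoryStream and BaseStream balances data integrity with performance. That attention to implementation detail shows professionalism.
In terms of technical depth, the mechanism you reveal, a blocked standard output stream causing the process to hang, is broadly applicable: it matters not only for git but for every process invocation that redirects output. The test cases also cover a good range of scenarios, from timeout handling to error output. In particular, the TestLargeOutput case, which actually produces 2000 lines of output, demonstrates directly that the improved approach works. Packaging the code as a NuGet package further increases its reuse value and fits the trend toward modular collaboration in modern development.
For future versions, consider three possible improvements: 1) add a memory optimization strategy for stream handling, such as switching to temporary file storage when the output exceeds a threshold; 2) add notes on cross-platform compatibility, especially any special handling needed for the Git Bash environment on Windows; 3) add exception handling that catches errors raised while reading the StandardOutput/StandardError streams. In addition, the implementation of FolderDeleter.DeleteByForce in the test code may deserve some explanation, to ensure that the temporary directory cleanup is reliable. Overall, this is a technical article with practical value, and I look forward to more in-depth discussion of inter-process communication optimization.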
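In modern software development, correctly running external programs and managing their input and output is a common but error-prone task. This article takes an in-depth look at how to handle it effectively, especially how to avoid blocking and buffer overflow when there is a large amount of output.
The article first points out the shortcoming of the traditional approach: using ReadToEnd() can lead to long waits or even deadlock. It then presents an improved method that reads the output stream while the process runs, so the buffer never fills up. This not only makes the program more efficient but also ensures that data is neither lost nor blocked. The author's solution uses memory streams and asynchronous tasks to process the output in real time, which makes the program run more smoothly and responsively. In addition, by introducing a cancellation token mechanism, long-running tasks can be terminated gracefully, which is essential for program stability and user experience.
The article closes with a practical NuGet package link, so developers can reference the packaged tool directly and save a lot of development time and effort. This not only improves code quality but also encourages sharing and collaboration in the community, which is a great convenience for developers.
Overall, this article provides a comprehensive and easy-to-implement solution that helps developers avoid common pitfalls when working with external processes and keeps their programs running efficiently and reliably.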
This blog post provides a detailed and insightful explanation of how to start a process, get output, and handle potential problems like infinite waits in C#. The author does an excellent job of stating the problem, explaining the cause, and providing a solution. The use of code snippets throughout the post makes it easy for readers to understand the concepts being discussed.
The author's discovery about the output stream needing to be consumed is a key takeaway from this blog post. This is a crucial point that many developers might overlook, and the author does a good job of explaining it in a clear and concise manner.
The final solution provided by the author is clean and efficient. It's evident that the author has a deep understanding of C# and process handling. Not only did they solve the problem at hand, but they also anticipated potential issues such as program timeout and non-existent programs.
One area that could be improved is the explanation of the test cases. For someone who is not familiar with testing in C#, the purpose and expected results of the test cases might not be clear. It would be beneficial to provide more context or explanation around these test cases to make the blog post more accessible to a wider audience.
Moreover, the blog ends with a promotion of the author's NuGet package. While it's great that the author has packaged their solution for others to use, it would be helpful to include a brief explanation of what the package includes and how to use it.
Overall, this blog post is a valuable resource for any developer working with C# and process handling. The author's deep understanding of the topic shines through, and their solution is both effective and efficient.