Split a quoted string with a delimiter

I want to split a string using whitespace as the delimiter, but it should handle quoted strings sensibly. For example, for the string

"John Smith" Ted Barry 

it should return three strings: John Smith, Ted and Barry.

Best answer

After messing around with it, you can use a regex for this. Run the equivalent of a "match all" on:

((?<=("))[\w ]*(?=("(\s|$))))|((?<!")\w+(?!"))

A Java example:

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class Test
{ 
    public static void main(String[] args)
    {
        // Inner double quotes and regex backslashes must be escaped in Java string literals
        String someString = "\"Multiple quote test\" not in quotes \"inside quote\" \"A work in progress\"";
        Pattern p = Pattern.compile("((?<=(\"))[\\w ]*(?=(\"(\\s|$))))|((?<!\")\\w+(?!\"))");
        Matcher m = p.matcher(someString);

        while(m.find()) {
            System.out.println(" " + m.group() + " ");
        }
    }
}

Output:

 Multiple quote test 
 not 
 in 
 quotes 
 inside quote 
 A work in progress 

A breakdown of the regular expression used in the example above can be viewed here:

http://regex101.com/r/wM6yT9


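That said, regular expressions should not be the solution to everything; I was just having fun. This example has a lot of edge cases, such as handling Unicode characters, symbols, and so on. You would be better off using a tried-and-true library for this kind of task. Look at the other answers before using this one.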
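
For comparison, here is a minimal sketch of a simpler pattern without lookarounds: it matches either a double-quoted run (captured without its quotes) or a run of non-whitespace characters. The class name `QuoteAwareRegexSplit` and the method `split` are made up for illustration and are not part of the answer above. The sketch assumes quoted runs are set off by whitespace and that there are no escaped quotes. Because it uses `[^"]` and `\S` rather than `\w`, non-ASCII text and punctuation pass through unchanged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class QuoteAwareRegexSplit {
    // Alternative 1: a double-quoted run, contents captured in group 1.
    // Alternative 2: a run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|\\S+");

    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // group(1) is null when the unquoted alternative matched
            tokens.add(m.group(1) != null ? m.group(1) : m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [John Smith, Ted, Barry]
        System.out.println(split("\"John Smith\" Ted Barry"));
    }
}
```

A word glued to a quote, such as hello"John Smith", is not split by this pattern, so it only suits input where quotes sit at token boundaries.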

Answers

Try this ugly code:

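    // Needs java.util.ArrayList, java.util.Arrays and java.util.List.
    String str = "hello my dear \"John Smith\" where is Ted Barry";
    List<String> list = Arrays.asList(str.split("\\s"));
    List<String> resultList = new ArrayList<String>();
    StringBuilder builder = new StringBuilder();
    for(String s : list){
        if(s.startsWith("\"")) {
            // Opening quote: buffer the first word of the quoted phrase
            builder.append(s.substring(1)).append(" ");
        } else {
            // Closing quote finishes the buffered phrase; anything else is a plain word.
            // Note: only quoted phrases of exactly two words are handled correctly.
            resultList.add((s.endsWith("\"") 
                    ? builder.append(s.substring(0, s.length() - 1)) 
                    : builder.append(s)).toString());
            builder.delete(0, builder.length());
        }
    }
    System.out.println(resultList);     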

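Well, I made a small snippet that does what you want and a bit more. Since you did not specify any further conditions, I did not go to the trouble of handling them. I know this is a dirty way of doing it, and you can probably get better results another way. But for the fun of programming, here is the example: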

    String example = "hello\"John Smith\" Ted Barry lol\"Basi German\"hello";
    int wordQuoteStartIndex=0;
    int wordQuoteEndIndex=0;

    int wordSpaceStartIndex = 0;
    int wordSpaceEndIndex = 0;

    boolean foundQuote = false;
    for(int index=0;index<example.length();index++) {
        if(example.charAt(index)=='"') {
            if(foundQuote==true) {
                wordQuoteEndIndex=index+1;
                //Print the quoted word
                System.out.println(example.substring(wordQuoteStartIndex, wordQuoteEndIndex));//here you can remove quotes by changing to (wordQuoteStartIndex+1, wordQuoteEndIndex-1)
                foundQuote=false;
                if(index+1<example.length()) {
                    wordSpaceStartIndex = index+1;
                }
            }else {
                wordSpaceEndIndex=index;
                if(wordSpaceStartIndex!=wordSpaceEndIndex) {
                    //print the word in spaces
                    System.out.println(example.substring(wordSpaceStartIndex, wordSpaceEndIndex));
                }
                wordQuoteStartIndex=index;
                foundQuote = true;
            }
        }

        if(foundQuote==false) {
            if(example.charAt(index)==' ') {
                wordSpaceEndIndex = index;
                if(wordSpaceStartIndex!=wordSpaceEndIndex) {
                    //print the word in spaces
                    System.out.println(example.substring(wordSpaceStartIndex, wordSpaceEndIndex));
                }
                wordSpaceStartIndex = index+1;
            }

            if(index==example.length()-1) {
                if(example.charAt(index)!='"') {
                    //print the word in spaces
                    System.out.println(example.substring(wordSpaceStartIndex, example.length()));
                }
            }
        }
    }

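This also handles words that are not separated by a space from the quote before or after them, for example the "hello" before "John Smith" and the one after "Basi German".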

When the string is changed to "John Smith" Ted Barry, the output is three strings: 1) "John Smith" 2) Ted 3) Barry

The string in the example is hello"John Smith" Ted Barry lol"Basi German"hello, and it prints 1) hello 2) "John Smith" 3) Ted 4) Barry 5) lol 6) "Basi German" 7) hello

Hope this helps.
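
The same scanning idea can also be sketched with a single in-quote flag and a buffer instead of four index variables. This is an illustrative variant, not the code above: the class `QuoteScanner` and the method `tokenize` are hypothetical names, the quotes are dropped from the output, empty quoted strings produce no token, and escaped quotes are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class QuoteScanner {
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                // A quote ends whatever token was being collected,
                // so a word glued to a quote is still split off
                flush(current, tokens);
                inQuote = !inQuote;
            } else if (c == ' ' && !inQuote) {
                // A space outside quotes ends the current word
                flush(current, tokens);
            } else {
                current.append(c);
            }
        }
        // Emit the trailing word (or the text after an unterminated quote)
        flush(current, tokens);
        return tokens;
    }

    // Adds the buffered token, if any, and resets the buffer.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        // Prints [hello, John Smith, Ted, Barry, lol, Basi German, hello]
        System.out.println(tokenize("hello\"John Smith\" Ted Barry lol\"Basi German\"hello"));
    }
}
```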

This is my own version, cleaned up from http://pastebin.com/aZngu65y (posted in the comment). It can handle Unicode. It collapses all excess whitespace (even inside quotes), which can be good or bad depending on your needs. Escaped quotes are not supported.

private static String[] parse(String param) {
  String[] output;

  // Pad every quote with spaces so each quote becomes its own fragment
  param = param.replaceAll("\"", " \" ").trim();
  String[] fragments = param.split("\\s+");

  int curr = 0;
  boolean matched = fragments[curr].matches("[^\"]*");
  if (matched) curr++;

  for (int i = 1; i < fragments.length; i++) {
    if (!matched)
      fragments[curr] = fragments[curr] + " " + fragments[i];

    if (!fragments[curr].matches("(\"[^\"]*\"|[^\"]*)"))
      matched = false;
    else {
      matched = true;

      if (fragments[curr].matches("\"[^\"]*\""))
        fragments[curr] = fragments[curr].substring(1, fragments[curr].length() - 1).trim();

      if (fragments[curr].length() != 0)
        curr++;

      if (i + 1 < fragments.length)
        fragments[curr] = fragments[i + 1];
    }
  }

  if (matched) { 
    return Arrays.copyOf(fragments, curr);
  }

  return null; // Parameter failure (double-quotes do not match up properly).
}

Sample input for comparison:

"sdfskjf" sdfjkhsd "hfrif ehref" "fksdfj sdkfj fkdsjf" sdf sfssd


asjdhj    sdf ffhj "fdsf   fsdjh"
日本語 中文 "Tiếng Việt" "English"
    dsfsd    
   sdf     " s dfs    fsd f   "  sd f   fs df  fdssf  "日本語 中文"
""   ""     ""
"   sdfsfds "   "f fsdf

(The 2nd line is empty, the 3rd line is only spaces, and the last line is malformed.) Please judge against your own expected output, since it may vary, but the baseline is that the 1st case should return [sdfskjf, sdfjkhsd, hfrif ehref, fksdfj sdkfj fkdsjf, sdf, sfssd].

Commons Lang has the StrTokenizer class to do this for you, and there is also the java-csv library.

StrTokenizer example:

String params = "\"John Smith\" Ted Barry";
// Initialize tokenizer with input string, delimiter character, quote character
StrTokenizer tokenizer = new StrTokenizer(params, ' ', '"');
for (String token : tokenizer.getTokenArray()) {
   System.out.println(token);
}

Output:

John Smith
Ted
Barry



